Article
Biochemistry & Molecular Biology
Hasan Zulfiqar, Shi-Shi Yuan, Qin-Lai Huang, Zi-Jie Sun, Fu-Ying Dao, Xiao-Long Yu, Hao Lin
Summary: This study developed a computational model to discriminate cyclin proteins from non-cyclin proteins with high accuracy. By encoding and optimizing protein sequences with seven features, and training a gradient boosting decision tree classifier, the model achieved better results than previous studies on the same data.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2021)
Article
Biochemistry & Molecular Biology
Leqi Tian, Wenbin Wu, Tianwei Yu
Summary: Random Forest (RF) is a popular machine learning method for classification and regression tasks, and it performs well with small sample sizes. However, gene selection with RF is problematic because the important genes are usually scattered across the gene network, which conflicts with the biological assumption of functional consistency. To address this issue, we propose the Graph Random Forest (GRF) method, which incorporates external topological information to identify highly connected important features. The algorithm achieves classification accuracy equivalent to RF while selecting interpretable feature sub-graphs.
Article
Genetics & Heredity
Bin Liu, Ziman Yang, Qing Liu, Ying Zhang, Hui Ding, Hongyan Lai, Qun Li
Summary: This study proposes a machine learning model based on multi-feature fusion to efficiently predict allergenic proteins. By extracting and optimizing protein sequence features, the model can accurately distinguish allergenic proteins from non-allergenic proteins, providing guidance for users to identify allergenic proteins.
FRONTIERS IN GENETICS
(2023)
Article
Environmental Sciences
Hao Fei, Zehua Fan, Chengkun Wang, Nannan Zhang, Tao Wang, Rengu Chen, Tiecheng Bai
Summary: This study proposes a county-scale cotton mapping method based on multiple features and random forest. By selecting spectral features, vegetation indices, and texture features, and exploring the contribution of texture features to cotton classification accuracy, the study improves the accuracy and efficiency of cotton classification.
Article
Biochemical Research Methods
Yunpei Xu, Hong-Dong Li, Cui-Xiang Lin, Ruiqing Zheng, Yaohang Li, Jinhui Xu, Jianxin Wang
Summary: Single-cell RNA sequencing (scRNA-seq) is a powerful tool for dissecting biological tissues. This study presents CellBRF, a feature selection method that considers the relevance of genes to cell types, improving single-cell clustering accuracy. CellBRF outperforms existing methods in terms of clustering accuracy and consistency, and has been successfully applied to cell differentiation stage identification, non-malignant cell subtype identification, and rare cell identification.
Article
Biology
Elahe Nasiri, Kamal Berahmand, Mehrdad Rostami, Mohammad Dabiri
Summary: Predicting interactions in protein networks is crucial for understanding biological processes, and computational methods, including the popular DeepWalk algorithm, have been used to predict protein interactions. This study focuses on link prediction in attributed networks by modifying DeepWalk with feature selection for more accurate prediction of protein-protein interactions. Results show that this approach outperforms other methods and improves prediction accuracy.
COMPUTERS IN BIOLOGY AND MEDICINE
(2021)
Review
Biochemistry & Molecular Biology
Haoyu Chao, Yueming Hu, Liang Zhao, Saige Xin, Qingyang Ni, Peijing Zhang, Ming Chen
Summary: This review provides an overview of the biogenesis, biological functions, and interactions of five major regulatory ncRNAs (miRNA, siRNA, tsRNA, circRNA, lncRNA) in plants. It also summarizes the tools and databases for analysis and prediction of plant ncRNAs, and presents a step-by-step computational analysis protocol for these ncRNAs.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2022)
Article
Biochemical Research Methods
Jing Sun, Li Pan, Bin Li, Haoyue Wang, Bo Yang, Wenbin Li
Summary: This study proposes a method for constructing protein-protein interaction networks by selecting relevant features instead of continuous and periodic features to improve the accuracy of identifying essential proteins.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2023)
Article
Health Care Sciences & Services
Valeria Maeda-Gutierrez, Carlos E. Galvan-Tejada, Miguel Cruz, Jorge I. Galvan-Tejada, Hamurabi Gamboa-Rosales, Alejandra Garcia-Hernandez, Huizilopoztli Luna-Garcia, Irma Gonzalez-Curiel, Monica Martinez-Acuna
Summary: The study constructed a predictive model to identify risk factors for diabetic retinopathy in the Mexican population. The model achieved an AUC of 69% and identified factors such as creatinine and lipid treatment as significant predictors. A risk evaluation method was also implemented to assess the impact of these factors.
JOURNAL OF PERSONALIZED MEDICINE
(2021)
Article
Environmental Sciences
Cuixia Wei, Bing Guo, Yewen Fan, Wenqian Zang, Jianwan Ji
Summary: There are significant differences in the dominant driving factors of the change process of different types of wetlands in the Yellow River delta. This study extracted and analyzed the change process of wetlands and their dominant factors using the Random Forest algorithm. The results showed that the wetlands were transformed into artificial wetlands, with an overall development direction of northwest-southeast.
Article
Environmental Sciences
Yimo Liu, Wanchang Zhang, Zhijie Zhang, Qiang Xu, Weile Li
Summary: This study aims to optimize landslide susceptibility model prediction by integrating landslide conditioning factors with the Q-statistic. The study was conducted in Atsuma Town, which suffered from landslides following the Eastern Iburi Earthquake in 2018. Four subsets were selected for training and validating the model, showing that optimized conditioning factors can effectively improve the prediction accuracy of landslide susceptibility mapping.
Article
Health Care Sciences & Services
Aneta Polewko-Klim, Krzysztof Mnich, Witold R. Rudnicki
Summary: A hybrid protocol integrating clinical and molecular data was proposed for more effective classification of cancer patients. This method yielded promising results in predicting clinical endpoints for breast cancer and urothelial bladder carcinoma samples.
JOURNAL OF MEDICAL SYSTEMS
(2021)
Article
Chemistry, Multidisciplinary
Li Yang, Tianyi Liu, Weijian Ren, Wenfeng Sun
Summary: The study applied the random forest algorithm and a fuzzy neural network to address the coupling problem of rate of penetration prediction in drilling engineering. By using K-means to divide data into fuzzy sets, the fuzzy neural network was trained with improved effectiveness.
Article
Forestry
Lukasz Pawlik, Janusz Godziek, Lukasz Zawolik
Summary: This study statistically modeled forest damage caused by the Klaus windstorm in southwest France and identified the best predictors for predictive models. The random forest technique was used to find the best classifiers and the resulting model was used for spatial prediction and probability mapping.
Article
Construction & Building Technology
Armin Kassemi Langroodi, Faridaddin Vahdatikhaki, Andre Doree
Summary: The monitoring and tracking of construction equipment is crucial for improving project efficiency, safety, and sustainability, with recent advancements in digital technologies being leveraged for developing monitoring systems. Research suggests that deep learning methods are becoming popular for equipment activity recognition.
AUTOMATION IN CONSTRUCTION
(2021)