Article

Hepatitis C Virus prediction based on machine learning framework: a real-world case study in Egypt

Journal

KNOWLEDGE AND INFORMATION SYSTEMS
Volume 65, Issue 6, Pages 2595-2617

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s10115-023-01851-4

Keywords

Machine learning; Classification; Feature selection; Hepatitis C Virus

Summary: This study proposes a prediction framework based on machine learning methods to predict Hepatitis C Virus among healthcare workers in Egypt. By utilizing real-world data and performing feature selection, the framework effectively predicts virus infection and improves classification accuracy.

Predicting and classifying diseases is essential in medical science, as it helps limit the spread of disease and identify affected regions at an early stage. Machine learning (ML) approaches are commonly used to predict and classify diseases and serve as an efficient tool for doctors and specialists. This paper proposes a prediction framework based on ML approaches to predict Hepatitis C Virus among healthcare workers in Egypt. We utilized real-world data from the National Liver Institute, founded at Menoufiya University (Menoufiya, Egypt). The collected dataset consists of 859 patients with 12 different features. To assess the robustness and reliability of the proposed framework, we evaluated two scenarios: the first without feature selection and the second after selecting features with sequential forward selection (SFS). Furthermore, the feature subset selected based on the generated features from SFS is evaluated. Naive Bayes, random forest (RF), K-nearest neighbor, and logistic regression are used as induction algorithms and classifiers for model evaluation. Then, the effect of parameter tuning on the learning techniques is measured. The experimental results indicated that the proposed framework achieved higher accuracies after SFS selection than without feature selection. Moreover, the RF classifier achieved 94.06% accuracy with a minimum learning elapsed time of 0.54 s. Finally, after adjusting the hyperparameter values of the RF classifier, the classification accuracy improved to 94.88% using only four features.
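
The sequential forward selection step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic (not the National Liver Institute dataset), a simple nearest-centroid classifier stands in for the four classifiers listed in the abstract, a single held-out split replaces whatever validation scheme the paper used, and all function names (`make_data`, `centroid_accuracy`, `sequential_forward_selection`) are hypothetical.

```python
import random


def make_data(n=600, n_features=6, informative=(0, 3), seed=0):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Held-out accuracy of a nearest-centroid classifier restricted to `cols`.

    Assumption: this stand-in classifier is used only to keep the sketch
    dependency-free; it is not one of the classifiers from the paper.
    """
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        dist = {
            label: sum((x[j] - c) ** 2 for j, c in zip(cols, centre))
            for label, centre in centroids.items()
        }
        correct += int(min(dist, key=dist.get) == t)
    return correct / len(y_test)


def sequential_forward_selection(X, y, k, split=0.7):
    """Greedy SFS: start empty, repeatedly add the feature that gives the
    best held-out accuracy, and stop once `k` features are selected."""
    cut = int(len(X) * split)
    X_tr, y_tr, X_te, y_te = X[:cut], y[:cut], X[cut:], y[cut:]
    selected, history = [], []
    remaining = list(range(len(X[0])))
    while len(selected) < k:
        scores = {
            j: centroid_accuracy(X_tr, y_tr, X_te, y_te, selected + [j])
            for j in remaining
        }
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        history.append(scores[best])
    return selected, history


X, y = make_data()
selected, history = sequential_forward_selection(X, y, k=2)
print("selected feature indices:", selected)
print("held-out accuracy after each step:", history)
```

In practice the scoring function would wrap the classifier of interest (for example a random forest) and use cross-validation rather than one split; the greedy loop itself stays the same.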

Recommended

Article Computer Science, Information Systems

The effect of choosing optimizer algorithms to improve computer vision tasks: a comparative study

Esraa Hassan, Mahmoud Y. Shams, Noha A. Hikal, Samir Elmougy

Summary: This article discusses the application of optimization algorithms in machine learning to improve model accuracy, and provides a detailed introduction to various optimization strategies and their complexities. The results of the study demonstrate that using the appropriate optimizer can significantly improve the performance and accuracy of machine learning models.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Green & Sustainable Science & Technology

A Novel WD-SARIMAX Model for Temperature Forecasting Using Daily Delhi Climate Dataset

Ahmed M. Elshewey, Mahmoud Y. Shams, Abdelghafar M. Elhady, Samaa M. Shohieb, Abdelaziz A. Abdelhamid, Abdelhameed Ibrahim, Zahraa Tarek

Summary: This study uses the Daily Delhi Climate Dataset and time series forecasting techniques to predict the temperature in Delhi. A hybrid forecasting model, combining Wavelet Decomposition (WD) and Seasonal Auto-Regressive Integrated Moving Average with Exogenous Variables (SARIMAX), is created to accurately forecast the temperature. Experimental results show that the WD-SARIMAX model performs better than other recent models for temperature forecasting in Delhi.

SUSTAINABILITY (2023)

Article Chemistry, Analytical

Bayesian Optimization with Support Vector Machine Model for Parkinson Disease Classification

Ahmed M. Elshewey, Mahmoud Y. Shams, Nora El-Rashidy, Abdelghafar M. Elhady, Samaa M. Shohieb, Zahraa Tarek

Summary: This paper presents an advanced model called BO-SVM to classify individuals with Parkinson's disease. Bayesian Optimization is used to optimize the hyperparameters of six machine learning models, and the performance of these models is evaluated using various metrics. Experimental results show that the SVM model achieves the highest accuracy of 92.3% after hyperparameter tuning using BO.

SENSORS (2023)

Article Computer Science, Artificial Intelligence

A high-quality feature selection method based on frequent and correlated items for text classification

Heba Mamdouh Farghaly, Tarek Abd El-Hafeez

Summary: The feature selection problem is a significant challenge in pattern recognition, especially for classification tasks. This work explores the use of association analysis in data mining to select meaningful features and proposes a novel feature selection technique for text classification. The technique effectively reduces redundant information while achieving high accuracy using only 6% of the features.

SOFT COMPUTING (2023)

Article Green & Sustainable Science & Technology

Soil Erosion Status Prediction Using a Novel Random Forest Model Optimized by Random Search Method

Zahraa Tarek, Ahmed M. Elshewey, Samaa M. Shohieb, Abdelghafar M. Elhady, Noha E. El-Attar, Sherif Elseuofi, Mahmoud Y. Shams

Summary: Soil erosion, the removal of soil particles from the earth's surface, occurs in three phases: dislocation, transport, and deposition. Various factors influence the velocity of soil erosion, including soil type, assembly, infiltration, and land cover. This paper proposes a model, RS-RF, which combines random search optimization with the Random Forest algorithm, for predicting soil erosion. The RS-RF model achieved the best outcomes compared to other machine learning techniques, with an accuracy rate of 97.4% using a dataset of 236 instances and 11 features.

SUSTAINABILITY (2023)

Article Computer Science, Interdisciplinary Applications

Comparative performance of ensemble machine learning for Arabic cyberbullying and offensive language detection

Marwa Khairy, Tarek M. Mahmoud, Ahmed Omar, Tarek Abd El-Hafeez

Summary: Research on abusive language and its detection has gained attention due to the impact of cyberbullying on individuals and society. The widespread accessibility of social media sites has led to a substantial increase in hate speech, bullying, and other forms of abuse. This study aimed to automate the detection of offensive language and cyberbullying, using a new Arabic balanced data set and comparing single and ensemble machine learning algorithms. The results showed that ensemble machine learning outperformed single learner classifiers, with the voting ensemble model achieving the best performance. Further improvement was achieved through hyperparameter tuning on an Arabic cyberbullying data set.

LANGUAGE RESOURCES AND EVALUATION (2023)

Article Computer Science, Information Systems

Water quality prediction using machine learning models based on grid search method

Mahmoud Y. Shams, Ahmed M. Elshewey, El-Sayed M. El-kenawy, Abdelhameed Ibrahim, Fatma M. Talaat, Zahraa Tarek

Summary: This study uses machine learning models to predict the water quality index and water quality classification, and improves accuracy through parameter optimization and tuning. The experimental results show that the GB model performs best in classification, with an accuracy of 99.50%. In regression, the MLP regressor model outperforms the other models, with a determination coefficient of 99.8%.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Review Computer Science, Artificial Intelligence

The effect of rebalancing techniques on the classification performance in cyberbullying datasets

Marwa Khairy, Tarek M. Mahmoud, Tarek Abd-El-Hafeez

Summary: This paper investigates the effectiveness of techniques for addressing class imbalance in cyberbullying datasets. An experimental study finds that the performance of resampling techniques depends on the dataset size, imbalance ratio, and classifier used.

NEURAL COMPUTING & APPLICATIONS (2023)

Article Chemistry, Analytical

Innovative Hybrid Approach for Masked Face Recognition Using Pretrained Mask Detection and Segmentation, Robust PCA, and KNN Classifier

Mohammed Eman, Tarek M. Mahmoud, Mostafa M. Ibrahim, Tarek Abd El-Hafeez

Summary: This paper proposes a novel method for masked face recognition that combines deep-learning-based mask detection, landmark and oval face detection, and robust principal component analysis (RPCA). Experimental results show that the proposed method outperforms existing methods in accuracy and robustness to occlusion, achieving a recognition rate of 97%, which is significantly higher than that of state-of-the-art methods.

SENSORS (2023)

Article Multidisciplinary Sciences

Quantum computing and machine learning for Arabic language sentiment classification in social media

Ahmed Omar, Tarek Abd El-Hafeez

Summary: This paper presents a comparative study of quantum computing and machine learning for Arabic language document classification. The results show that quantum computing slightly outperforms classic machine learning in sentiment analysis of Arabic tweets and has faster processing times for larger datasets. Additionally, classic machine learning achieves higher accuracy when dealing with smaller datasets.

SCIENTIFIC REPORTS (2023)

Article Multidisciplinary Sciences

Predicting female pelvic tilt and lumbar angle using machine learning in case of urinary incontinence and sexual dysfunction

Doaa A. Abdel Hady, Tarek Abd El-Hafeez

Summary: Urinary incontinence (UI) is defined as uncontrolled urine leakage and is an indication of pelvic floor dysfunction. This study aimed to predict core muscle activity in multiparous women with female sexual dysfunction (FSD) using machine learning models instead of relying on ultrasound imaging. The developed model showed high accuracy in predicting pelvic tilt and lumbar angle, potentially revolutionizing the assessment and management of this condition.

SCIENTIFIC REPORTS (2023)

Article Computer Science, Theory & Methods

Optimizing classification efficiency with machine learning techniques for pattern matching

Belal A. Hamed, Osman Ali Sadek Ibrahim, Tarek Abd El-Hafeez

Summary: The study proposes a novel model that combines machine learning methods with a pattern-matching algorithm for DNA sequence classification. The model aims to improve the accuracy and efficiency of DNA sequence classification by categorizing DNA sequences based on their features. The results show that the SVM linear classifier achieves the highest accuracy and F1 score among the tested algorithms, suggesting that the proposed model outperforms other algorithms in DNA sequence classification. The study also explores the impact of pattern length on accuracy and time complexity, finding that the execution time of each algorithm varies with the pattern length.

JOURNAL OF BIG DATA (2023)

Article Computer Science, Information Systems

Innovative Feature Selection Method Based on Hybrid Sine Cosine and Dipper Throated Optimization Algorithms

Abdelaziz A. Abdelhamid, El-Sayed M. El-Kenawy, Abdelhameed Ibrahim, Marwa Metwally Eid, Doaa Sami Khafaga, Amel Ali Alhussan, Seyedali Mirjalili, Nima Khodadadi, Wei Hong Lim, Mahmoud Y. Shams

Summary: Feature selection is a crucial task in pattern recognition and data mining. This study proposes a novel hybrid binary meta-heuristic algorithm, bSCWDTO, to solve the feature selection problem. The algorithm outperforms ten state-of-the-art optimization methods and is statistically different from alternative feature selection methods.

IEEE ACCESS (2023)