Article

Hybrid supervised clustering based ensemble scheme for text classification

Journal

KYBERNETES
Volume 46, Issue 2, Pages 330-348

Publisher

EMERALD GROUP PUBLISHING LTD
DOI: 10.1108/K-10-2016-0300

Keywords

Diversity; Text classification; Classifier ensemble; Supervised clustering

Purpose - The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification is an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied to construct efficient text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of an ensemble, which is a key issue in ensemble design.

Design/methodology/approach - An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, which is based on the cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversity can be provided. Each classifier is trained on the diversified training subsets, and the predictions of the individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naive Bayes, logistic regression, support vector machines and the C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.

Findings - The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value - The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.
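The pipeline described under Design/methodology/approach (cluster each class separately, build training subsets from the clusters, train one classifier per subset, combine by majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation. Several parts are assumptions made for illustration: plain k-means with a deterministic start replaces the cuckoo search and k-means hybrid, a nearest-centroid rule stands in for the base classifiers, subset j is formed by pooling cluster j of every class, and the toy data and all function names are hypothetical.

```python
from collections import Counter

def sq_dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    # Component-wise mean of a non-empty list of vectors.
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, iters=20):
    # Plain Lloyd's k-means; ASSUMPTION: stands in for the paper's
    # cuckoo search + k-means hybrid. Requires len(points) >= k.
    centroids = [list(p) for p in points[:k]]  # deterministic start
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return assign

def build_subsets(X, y, k):
    # Supervised clustering: cluster each class on its own.
    # ASSUMPTION: subset j pools cluster j of every class.
    subsets = [([], []) for _ in range(k)]
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        for p, j in zip(pts, kmeans(pts, k)):
            subsets[j][0].append(p)
            subsets[j][1].append(label)
    return subsets

def train_centroid_classifier(X, y):
    # Stand-in base learner: one mean vector per class.
    return {label: mean([x for x, t in zip(X, y) if t == label])
            for label in set(y)}

def predict_one(model, x):
    # Predict the class whose mean vector is closest to x.
    return min(model, key=lambda label: sq_dist(x, model[label]))

def ensemble_predict(models, x):
    # Majority voting over the individual classifiers.
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes in two dimensions.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [1.2, 0.9],
     [5.0, 5.1], [5.2, 4.9], [6.0, 6.1], [6.2, 5.9]]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]

models = [train_centroid_classifier(Xs, ys)
          for Xs, ys in build_subsets(X, y, k=2) if ys]
prediction_low = ensemble_predict(models, [0.5, 0.5])
prediction_high = ensemble_predict(models, [5.5, 5.5])
```

In this sketch each ensemble member sees a different region of every class, which is one simple way to obtain the diverse training subsets the abstract refers to. A real text classification setting would replace the toy vectors with document features and the nearest-centroid rule with stronger base learners.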

Recommended

Article Computer Science, Interdisciplinary Applications

Sentiment analysis on massive open online course evaluations: A text mining and deep learning approach

Aytug Onan

Summary: The study evaluated the predictive performance of conventional supervised learning methods, ensemble learning methods, and deep learning methods, as well as the efficiency of text representation and word-embedding schemes in sentiment analysis on MOOC evaluations. Analysis of a corpus containing 66,000 MOOC reviews indicated that deep learning-based architectures outperformed other methods for sentiment analysis on educational data mining. The highest predictive performance was achieved by long short-term memory networks combined with GloVe word-embedding scheme-based representation, with a classification accuracy of 95.80%.

COMPUTER APPLICATIONS IN ENGINEERING EDUCATION (2021)

Article Computer Science, Interdisciplinary Applications

Weighted word embeddings and clustering-based identification of question topics in MOOC discussion forum posts

Aytug Onan, Mansur Alp Tocoglu

Summary: The study aims to use weighted word embeddings and clustering techniques to cluster MOOC discussion forum posts and identify question topics. By evaluating four word-embedding schemes, four weighting functions, and four clustering algorithms, it is found that weighted word-embedding schemes combined with clustering algorithms outperform conventional schemes.

COMPUTER APPLICATIONS IN ENGINEERING EDUCATION (2021)

Article Computer Science, Software Engineering

Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks

Aytug Onan

Summary: A deep learning architecture combining TF-IDF-weighted GloVe word embeddings with a CNN-LSTM architecture outperforms conventional deep learning methods in sentiment analysis of product reviews from Twitter.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE (2021)

Article Engineering, Multidisciplinary

An improved ant-based algorithm based on heaps merging and fuzzy c-means for clustering cancer gene expression data

Hasan Bulut, Aytug Onan, Serdar Korukoglu

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES (2020)

Article Computer Science, Information Systems

Bidirectional convolutional recurrent neural network architecture with group-wise enhancement mechanism for text sentiment classification

Aytug Onan

Summary: This paper proposes a bidirectional convolutional recurrent neural network architecture for sentiment analysis, which utilizes bidirectional LSTM and GRU layers to extract past and future contexts, and employs a group-wise enhancement mechanism to strengthen important features and weaken less important ones. Experimental results demonstrate that this architecture achieves state-of-the-art performance in sentiment analysis.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2022)

Article Computer Science, Artificial Intelligence

Bi-Directional CNN-RNN Architecture with Group-Wise Enhancement and Attention Mechanisms for Cryptocurrency Sentiment Analysis

Gul Cihan Habek, Mansur Alp Tocoglu, Aytug Onan

Summary: With the growth of the cryptocurrency trading market, sentiment analysis of cryptocurrency comments has become crucial. A novel deep neural network architecture was introduced for sentiment classification, showing an accuracy of 93.77% in experimental results.

APPLIED ARTIFICIAL INTELLIGENCE (2022)

Article Engineering, Electrical & Electronic

DeepChestNet: Artificial intelligence approach for COVID-19 detection on computed tomography images

Mahmut Agrali, Volkan Kilic, Aytug Onan, Esra Meltem Koc, Ali Murat Koc, Rasit Eren Buyuktoka, Turker Acar, Zehra Adibelli

Summary: The conventional approach for identifying ground-glass opacities (GGO) in medical imaging is the convolutional neural network (CNN), which shows promising performance in COVID-19 detection. However, CNNs have limitations in capturing the structured relationships of GGO. This paper proposes a novel framework called DeepChestNet that leverages structured relationships by performing segmentation and classification on lung, pulmonary lobe, and GGO, leading to enhanced detection and diagnosis of COVID-19.

INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY (2023)

Article Computer Science, Information Systems

Hierarchical graph-based text classification framework with contextual node embedding and BERT-based dynamic fusion

Aytug Onan

Summary: We propose a novel hierarchical graph-based text classification framework that leverages contextual node embedding and BERT-based dynamic fusion to capture complex relationships between nodes in the hierarchy. The framework consists of seven stages: Linguistic Feature Extraction, Hierarchical Node Construction, Contextual Node Embedding, Multi-Level Graph Learning, Dynamic Text Sequential Feature Interaction, Attention-Based Graph Learning, and Dynamic Fusion with BERT. Experimental results on benchmark datasets demonstrate significant improvements in classification accuracy compared to state-of-the-art methods.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2023)

Article Computer Science, Information Systems

SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization

Aytug Onan

Summary: The process of creating high-quality labeled data is crucial but time-consuming. This paper proposes a text augmentation framework called SRL-ACO that leverages Semantic Role Labeling and Ant Colony Optimization techniques to enhance the accuracy of natural language processing models without requiring manual data annotation.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2023)

Proceedings Paper Acoustics

Automated Image Captioning with Multi-layer Gated Recurrent Unit

Ozge Taylan Moral, Volkan Kilic, Aytug Onan, Wenwu Wang

Summary: Describing the semantic content of an image through natural language has attracted significant interest in computer vision and language processing. Existing image captioning approaches have limitations in generating accurate captions due to their inability to effectively use visual information. This paper proposes an improved method using multi-layer GRU to enhance the semantic coherence of captions, and experimental results demonstrate its superiority.

2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022) (2022)

Proceedings Paper Acoustics

Auxiliary Classifier based Residual RNN for Image Captioning

Ozkan Cayli, Volkan Kilic, Aytug Onan, Wenwu Wang

Summary: Image captioning is the task of generating descriptive captions for visual content using natural language automatically. Recent advancements in deep neural networks have improved the generation of natural and semantic text in image captioning. However, maintaining gradient flow between neurons in consecutive layers becomes challenging with deeper networks. In this paper, the authors propose integrating an auxiliary classifier into the residual recurrent neural network to enhance caption generation by enabling gradient flow to reach bottom layers. Experiments on MSCOCO and VizWiz datasets demonstrate the superiority of the proposed approach over state-of-the-art methods in multiple performance metrics.

2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022) (2022)

Proceedings Paper Cell & Tissue Engineering

Synchrosqueezing Transform and Non-Negative Matrix Factorization based Feature Extraction from EEG Signals for Motor Imagery Classification

Duygu Degirmenci, Mehmet Akif Ozdemir, Onan Guren, Aytug Onan

Summary: This study aims to improve classification performance of EEG signals for MI tasks by extracting discriminative features with NMF from TFD obtained by WSST, achieving outstanding accuracy, kappa, and F1 score with various classifiers. WSST provides energy distributions with high localization capability in TFD, offering a promising approach for MI task classification.

2022 MEDICAL TECHNOLOGIES CONGRESS (TIPTEKNO'22) (2022)

Proceedings Paper Engineering, Electrical & Electronic

Multi-GRU Based Automated Image Captioning for Smartphones

Rumeysa Keskin, Ozge Taylan Moral, Volkan Kilic, Aytug Onan

Summary: The study introduces an automatic image captioning system for smartphones, utilizing advanced visual information and decoder structure to generate more meaningful image descriptions. The system performs well on the MSCOCO dataset and is integrated into a custom Android application, IMECA, for offline caption generation.

29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021) (2021)

Article Computer Science, Information Systems

A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

Aytug Onan, Mansur Alp Tocoglu

Summary: This research aims to present an effective sarcasm identification framework on social media data by utilizing neural language models and deep neural networks. The model includes a three-layer stacked bidirectional long short-term memory architecture and introduces an inverse gravity moment based term weighted word embedding model to preserve word-ordering information. The presented model achieves promising results with a classification accuracy of 95.30% for the sarcasm identification task.

IEEE ACCESS (2021)

Article Computer Science, Artificial Intelligence

Satire identification in Turkish news articles based on ensemble of classifiers

Aytug Onan, Mansur Alp Tocoglu

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES (2020)
