4.6 Article

A Hybrid Swarm and Gravitation-based feature selection algorithm for handwritten Indic script classification problem

Journal

COMPLEX & INTELLIGENT SYSTEMS
Volume 7, Issue 2, Pages 823-839

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s40747-020-00237-1

Keywords

Feature selection; Hybrid Swarm and Gravitation-based Feature Selection; Particle swarm optimization; Gravitational search algorithm; Handwritten script classification; Indic script

This study introduces a new feature selection (FS) algorithm, HSGFS, to reduce dimensionality and improve the accuracy of handwritten script classification. Experiments show an average improvement of 2-5% in classification accuracy while using 75-80% of the original feature vectors. The proposed method also outperforms some popular FS models.
In any multi-script environment, handwritten script classification is an unavoidable prerequisite before document images are fed to their respective Optical Character Recognition (OCR) engines. Over the years, researchers have tackled this complex pattern classification problem with various feature vectors, most of them high-dimensional, which increases the computational complexity of the whole classification model. Feature Selection (FS) can serve as an intermediate step that reduces the size of the feature vectors by restricting them to the essential and relevant features. In the present work, we address this issue by introducing a new FS algorithm, called Hybrid Swarm and Gravitation-based FS (HSGFS). The algorithm is applied to three feature vectors recently introduced in the literature: Distance-Hough Transform (DHT), Histogram of Oriented Gradients (HOG), and Modified log-Gabor (MLG) filter Transform. Three state-of-the-art classifiers, namely Multi-Layer Perceptron (MLP), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), are used to evaluate the optimal subset of features generated by the proposed FS model. Handwritten datasets at block, text-line, and word level, covering 12 officially recognized Indic scripts, are prepared for experimentation. Classification accuracy improves by 2-5% on average while using only about 75-80% of the original feature vectors on all three datasets. The proposed method also performs better than some popularly used FS models. The code implementing HSGFS is available on GitHub: https://github.com/Ritam-Guha/HSGFS.
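
To make the idea of scoring a feature subset with a classifier concrete, here is a minimal sketch of a wrapper-style fitness function. It is an illustration only, not the authors' HSGFS implementation (see the linked repository for that). The names `loo_1nn_error` and `fitness`, the weight `alpha`, the leave-one-out 1-nearest-neighbour evaluator, and the toy data are all assumptions made for this example.

```python
from itertools import product


def loo_1nn_error(X, y, mask):
    """Leave-one-out error rate of a 1-nearest-neighbour classifier,
    using only the features whose mask entry is 1 (hypothetical helper)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    errors = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            dist = sum((xi[j] - xk[j]) ** 2 for j in cols)  # squared Euclidean
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        errors += best_label != y[i]
    return errors / len(X)


def fitness(X, y, mask, alpha=0.99):
    """Lower is better: weighted sum of classification error and the
    fraction of features kept. alpha is an assumed trade-off weight."""
    if not any(mask):
        return float("inf")  # an empty subset cannot classify anything
    ratio = sum(mask) / len(mask)
    return alpha * loo_1nn_error(X, y, mask) + (1 - alpha) * ratio


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -5.0], [0.2, 4.0],
     [1.0, 5.1], [1.1, -5.1], [1.2, 4.1]]
y = [0, 0, 0, 1, 1, 1]

# With two features the search space is small enough to enumerate.
best_mask = min(product([0, 1], repeat=2), key=lambda m: fitness(X, y, m))
print(best_mask, fitness(X, y, best_mask))
```

On real feature vectors with hundreds of dimensions, exhaustive enumeration is infeasible. That is where a metaheuristic search comes in, such as the hybrid of particle swarm optimization and gravitational search named in the keywords.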

Recommended

Article Computer Science, Information Systems

A feature selection model for speech emotion recognition using clustering-based population generation with hybrid of equilibrium optimizer and atom search optimization algorithm

Soham Chattopadhyay, Arijit Dey, Pawan Kumar Singh, Ali Ahmadian, Ram Sarkar

Summary: Speech is crucial in human communication and human-computer interaction. In the field of AI and ML, it has been extensively studied to recognize human emotions from speech signals. To address the challenge of large feature dimension, a hybrid feature selection algorithm called CEOAS is proposed. By extracting LPC and LPCC features, the proposed model reduces feature dimension and improves classification accuracy. Impressive recognition accuracies have been achieved on four benchmark datasets, surpassing state-of-the-art algorithms.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

Automatic spoken language identification using MFCC based time series features

Mainak Biswas, Saif Rahaman, Ali Ahmadian, Kamalularifin Subari, Pawan Kumar Singh

Summary: Spoken Language Identification (SLID) is a well-researched field and an important first step in multilingual speech recognition systems. This study proposes a model for Indian and foreign language recognition, which enhances data to make it robust against everyday life noise and selects relevant features through feature extraction and selection algorithms. The model achieves high accuracy on three standard datasets, indicating that these features capture language specific characteristics of speech and can be used as standard features for SLID task.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Business, Finance

Impact of pandemic on development and demography in different continents and nations

Pawan Kumar Singh, Alok Kumar Pandey, Ravi Kiran, Rajiv Kumar Bhatt, Anushka Chouhan

Summary: This study collected information from 145 countries to predict the impact of COVID-19 cases, tests per million, and the proportion of people aged 65 and above on deaths per million at country and continent levels. It also evaluated the economic cost of these indicators in terms of reduction in GDP growth rate. The study found significant differences across continents and a negative association between tests per million and deaths per million. It provides valuable insights for assessing the impact of these indicators in the pandemic and informing policy formation and decision-making strategies.

INTERNATIONAL JOURNAL OF FINANCE & ECONOMICS (2023)

Article Environmental Sciences

Forecasting of non-renewable and renewable energy production in India using optimized discrete grey model

Alok Kumar Pandey, Pawan Kumar Singh, Muhammad Nawaz, Amrendra Kumar Kushwaha

Summary: Renewable energy plays an important role in providing reliable power supplies and diversifying fuel sources, while also helping to conserve natural resources. Solar energy has become increasingly prominent in India. This study forecasts the development of renewable energy and finds that wind power is growing faster than hydropower, solar energy, and bioenergy.

ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH (2023)

Article Computer Science, Artificial Intelligence

A new population initialization approach based on Metropolis-Hastings (MH) method

Erik Cuevas, Hector Escobar, Ram Sarkar, Heba F. Eid

Summary: This paper proposes a new population initialization method for metaheuristic algorithms, where the initial set of candidate solutions is obtained through the sampling of the objective function. The method aims to find initial solutions that are close to the prominent values of the objective function, and these initial points represent promising regions of the search space. The proposed approach shows faster convergence and improved quality of solutions compared to other similar approaches.

APPLIED INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Breast cancer detection in thermograms using a hybrid of GA and GWO based deep feature selection method

Rishav Pramanik, Payel Pramanik, Ram Sarkar

Summary: Breast cancer is a leading cause of premature death among women globally, but early detection and diagnosis can save lives. Hence, computer scientists are working to develop reliable models to tackle this disease. A proposed lightweight model combines transfer learning-based deep learning (DL) with feature selection to detect abnormalities in breast thermograms. This model performs well in detecting and differentiating malignant and healthy breasts.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Automation & Control Systems

A modified GNN architecture with enhanced aggregator and Message Passing Functions

Debjit Sarkar, Sourodeep Roy, Samir Malakar, Ram Sarkar

Summary: Graph neural networks (GNN) maintain the essence of irregularly structured information in a graph through message passing and feature aggregation. A weighting scheme called VecGNN is proposed to incorporate inter-node feature-level correlational information, considering the relative position of nodes in the feature space. VecGNN outperforms baseline models GCN, GAT, and JKNets by 2%-4% on citation datasets.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Gamma function based ensemble of CNN models for breast cancer detection in histopathology images

Samriddha Majumdar, Payel Pramanik, Ram Sarkar

Summary: Breast cancer is the second deadliest disease among women globally. Histopathology image analysis is an effective method for detecting tumor malignancies. Computer-aided diagnosis (CAD) using convolutional neural network (CNN) models has shown potential in breast histopathological image classification, but there is room for improvement. This paper proposes a novel rank-based ensemble method that combines multiple CNN models to enhance classification accuracy.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Engineering, Manufacturing

Virtual metrology in long batch processes using machine learning

Ritam Guha, Anirudh Suresh, Jared DeFrain, Kalyanmoy Deb

Summary: A long batch process involves recording sensor data with complicated non-linear dynamics, which are difficult to model. Predicting process outcomes before completion is important, and Virtual Metrology (VM) has been proposed as a solution. This paper introduces a generalized VM pipeline with a deep-learning model that can handle high-dimensional input sensors and outputs. The model can predict industrial process outcomes with less than 10% error after about one-fifth of the total process-time.

MATERIALS AND MANUFACTURING PROCESSES (2023)

Article Computer Science, Information Systems

An ensemble approach to detect copy-move forgery in videos

Sk Mohiuddin, Samir Malakar, Ram Sarkar

Summary: Video forgery has become more common due to the easy availability of tools. This study proposes an ensemble based method to detect duplicate frames in a video. By extracting different types of features and applying lexicographical sorting, the method achieves high detection accuracy and outperforms state-of-the-art methods.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

A comprehensive survey on state-of-the-art video forgery detection techniques

Sk Mohiuddin, Samir Malakar, Munish Kumar, Ram Sarkar

Summary: Video plays a critical role in conveying authenticity in various fields such as surveillance, medicine, journalism, and social media. However, the trust in videos is diminishing due to the ease of video forgery using accessible editing tools. This article comprehensively discusses the initiatives and recent trends in video forgery detection research worldwide.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

JUVDsi v1: developing and benchmarking a new still image database in Indian scenario for automatic vehicle detection

Avirup Bhattacharyya, Avigyan Bhattacharya, Sourajit Maity, Pawan Kumar Singh, Ram Sarkar

Summary: Designing an automatic vehicle detection system that caters to the requirements of the traffic management system is important. This research develops a still image database, JUVDsi v1, for designing an automated traffic management system in India. The database addresses the shortcomings of existing databases and is evaluated using state-of-the-art deep learning architectures.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Computer Science, Interdisciplinary Applications

Discrete equilibrium optimizer combined with simulated annealing for feature selection

Ritam Guha, Kushal Kanti Ghosh, Suman Kumar Bera, Ram Sarkar, Seyedali Mirjalili

Summary: This paper proposes a binary adaptation of Equilibrium Optimizer (EO) called Discrete EO (DEO) for solving binary optimization problems. DEOSA algorithm, combining DEO with Simulated Annealing (SA) as a local search procedure, is applied to various datasets and outperforms other algorithms. The scalability and robustness of DEOSA are also tested on high-dimensional Microarray datasets and Knapsack problems, showing its superiority.

JOURNAL OF COMPUTATIONAL SCIENCE (2023)

Article Mathematics

Identifying Genetic Signatures from Single-Cell RNA Sequencing Data by Matrix Imputation and Reduced Set Gene Clustering

Soumita Seth, Saurav Mallik, Atikul Islam, Tapas Bhadra, Arup Roy, Pawan Kumar Singh, Aimin Li, Zhongming Zhao

Summary: In this paper, a new framework is introduced to discover gene signatures from scRNA-seq data. The framework combines various strategies such as imputed matrix, MRMR feature selection, and shrinkage clustering. The results show that the proposed framework efficiently identifies differentially expressed stronger gene signatures and up-regulated markers in single-cell RNA sequencing data.

MATHEMATICS (2023)