4.7 Article

Indic handwritten script identification using offline-online multi-modal deep network

Journal

INFORMATION FUSION
Volume 57, Pages 1-14

Publisher

ELSEVIER
DOI: 10.1016/j.inffus.2019.10.010

Keywords

Handwritten script identification; Deep neural network; Multi-modal learning; Offline and online handwriting; Character level training

Abstract

In this paper, we propose a novel approach to word-level Indic script identification that uses only character-level data in the training stage. Our method uses a multi-modal deep network that takes both the offline and online modalities of the data as input, so that information from both modalities is explored jointly for the script identification task. We take handwritten data in either modality as input, and the opposite modality is generated through inter-modality conversion. We then feed this offline-online modality pair to our network. Hence, along with the advantage of utilizing information from both modalities, the proposed framework works for both offline and online script identification, which removes the need to design two separate script identification modules for the individual modalities. We also propose a novel conditional multi-modal fusion scheme to combine the information from the offline and online modalities; it takes into account the original modality of the data being fed to our network and thus combines the two adaptively. An exhaustive experimental study has been carried out on a data set including English (Roman) and 6 other official Indic scripts. Our proposed scheme outperforms both traditional classifiers with handcrafted features and deep learning based methods. Experimental results show that using only character-level training data can achieve performance competitive with traditional training using word-level data.
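The abstract does not specify how the conditional fusion is computed, so the following is only a minimal sketch of one way such a scheme could look, not the authors' architecture. It assumes each modality has already been encoded into a feature vector of equal size, and that a small gate produces two mixing weights from the features plus a flag marking which modality was the original input. The names conditional_fusion, W, b and the signed flag are hypothetical, and the gate weights are set by hand here purely for illustration; in a real network they would be learned.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def conditional_fusion(f_off, f_on, source_is_online, W, b):
    """Mix offline and online features with weights that depend on the source modality.

    f_off, f_on      : (d,) feature vectors from the two branches (assumed given)
    source_is_online : True if the original input was online, False if offline
    W, b             : hypothetical gate parameters, shapes (2, 2*d + 1) and (2,)
    """
    flag = 1.0 if source_is_online else -1.0        # signed source-modality flag
    gate_in = np.concatenate([f_off, f_on, [flag]])
    alpha = softmax(W @ gate_in + b)                # two weights that sum to 1
    fused = alpha[0] * f_off + alpha[1] * f_on      # weighted combination
    return fused, alpha

d = 8
rng = np.random.default_rng(0)
f_off = rng.normal(size=d)
f_on = rng.normal(size=d)

# Hand-set gate (illustration only): ignore the features and let the flag
# push weight toward whichever modality was the original input.
W = np.zeros((2, 2 * d + 1))
W[0, -1] = -1.0
W[1, -1] = 1.0
b = np.zeros(2)

fused_a, alpha_a = conditional_fusion(f_off, f_on, False, W, b)  # offline source
fused_b, alpha_b = conditional_fusion(f_off, f_on, True, W, b)   # online source
print(alpha_a, alpha_b)
```

With this hand-set gate, the offline feature receives the larger weight when the source is offline and the online feature receives it when the source is online, which is one plausible reading of combining "adaptively"; a trained gate could behave differently.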

Recommended

Article Engineering, Civil

Vehicular Trajectory Classification and Traffic Anomaly Detection in Videos Using a Hybrid CNN-VAE Architecture

Kelathodi Kumaran Santhosh, Debi Prosad Dogra, Partha Pratim Roy, Adway Mitra

Summary: The study proposes a method for trajectory classification and anomaly detection using a hybrid CNN-VAE architecture. By introducing high-level features and semi-supervised class labeling, the accuracy of trajectory classification and anomaly detection is significantly improved.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2022)

Article Engineering, Electrical & Electronic

Robust Scene Text Detection for Partially Annotated Training Data

Prateek Keserwani, Rajkumar Saini, Marcus Liwicki, Partha Pratim Roy

Summary: This article analyzes the impact of training data with partial annotations on scene text detection and proposes a text region refinement approach to address it. The proposed method refines text regions by generating pseudo-labels, resulting in significant improvement over existing approaches for partially annotated training data.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)

Article Computer Science, Artificial Intelligence

EEG-Based Cognitive State Classification and Analysis of Brain Dynamics Using Deep Ensemble Model and Graphical Brain Network

Debashis Das Chakladar, Partha Pratim Roy, Masakazu Iwamura

Summary: The article introduces a method to study the cognitive state of operators using EEG signals and proposes a deep ensemble model for classifying the cognitive state. Additionally, an algorithm is proposed to identify the brain dynamics for each cognitive state and analyze the connectivity between different brain regions.

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS (2022)

Article Automation & Control Systems

Crowd Characterization in Surveillance Videos Using Deep-Graph Convolutional Neural Network

Shreetam Behera, Debi Prosad Dogra, Malay Kumar Bandyopadhyay, Partha Pratim Roy

Summary: Crowd behavior is a natural phenomenon and modeling the visual appearance of a large crowd can provide valuable insights into its dynamics. In this article, a graph classification framework is proposed for crowd characterization using a deep graph convolutional neural network. Experimental results show significant improvements in accuracy and area under the curve (AUC) compared to existing frameworks.

IEEE TRANSACTIONS ON CYBERNETICS (2023)

Article Computer Science, Artificial Intelligence

Cognitive Workload Estimation Using Variational Autoencoder and Attention-Based Deep Model

Debashis Das Chakladar, Sumalyo Datta, Partha Pratim Roy, Vinod A. Prasad

Summary: An effective VAE-CBAM-based deep model is proposed in this article for estimating cognitive states. The model extracts noise-free robust features from the latent space using VAE and improves the spatial resolution of EEG signals using CBAM. Experimental results show that the proposed model achieves good classification accuracy under different cognitive task conditions.

IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Sketch-Segformer: Transformer-Based Segmentation for Figurative and Creative Sketches

Yixiao Zheng, Jiyang Xie, Aneeshan Sain, Yi-Zhe Song, Zhanyu Ma

Summary: This paper proposes the Sketch-Segformer framework for sketch semantic segmentation, which effectively utilizes multi-facet information of sketches to achieve state-of-the-art performance. By treating sketches as stroke sequences and incorporating order embedding and spatial embeddings, Sketch-Segformer demonstrates superior segmentation accuracy.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Summary: This paper presents a method to transform abstract, deformed sketches into photorealistic images without the need for an edgemap-like sketch. The researchers propose a decoupled encoder-decoder training paradigm and use an autoregressive sketch mapper to bridge the abstraction gap between sketch and photo. The generated results outperform state-of-the-art methods in fine-grained sketch-based image retrieval tasks.

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR (2023)

Article Engineering, Biomedical

Efficacy of transformer networks for classification of EEG data

Gourav Siddhad, Anmol Gupta, Debi Prosad Dogra, Partha Pratim Roy

Summary: This paper explores the effectiveness of using transformer networks to classify EEG data. The performance was evaluated on both local and public datasets, and the results show that the transformer networks achieved comparable accuracy to state-of-the-art methods without the need for feature extraction.

BIOMEDICAL SIGNAL PROCESSING AND CONTROL (2024)

Proceedings Paper Computer Science, Artificial Intelligence

ENDE-GNN: An Encoder-decoder GNN Framework for Sketch Semantic Segmentation

Yixiao Zheng, Jiyang Xie, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo

Summary: In this study, an encoder-decoder GNN framework called ENDE-GNN is proposed to improve the performance of sketch semantic segmentation. ENDE-GNN extracts both inter-stroke and intra-stroke features and pays attention to the drawing order of sketches. Experimental results demonstrate that ENDE-GNN achieves state-of-the-art performance on multiple sketch semantic segmentation datasets.

2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Adaptive Fine-Grained Sketch-Based Image Retrieval

Ayan Kumar Bhunia, Aneeshan Sain, Parth Hiren Shah, Animesh Gupta, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Summary: This paper focuses on the generalization of fine-grained sketch-based image retrieval (FG-SBIR) and proposes a model-agnostic meta-learning framework that adapts quickly with few samples. The proposed framework includes key modifications to improve the stability and effectiveness of the model, and experiments show significant improvements over existing approaches.

COMPUTER VISION, ECCV 2022, PT XXXVII (2022)

Proceedings Paper Computer Science, Artificial Intelligence

FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context

Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song

Summary: In this paper, we advance sketch research using the first dataset of freehand scene sketches, FS-COCO. We collect 10,000 freehand scene vector sketches from 100 non-expert individuals, accompanied by text descriptions. This dataset allows us to study fine-grained image retrieval from freehand scene sketches and sketch captions, and explore insights on scene salience, performance comparison, and complementarity of information.

COMPUTER VISION, ECCV 2022, PT VIII (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Sketch3T: Test-Time Training for Zero-Shot SBIR

Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Summary: This paper extends zero-shot sketch-based image retrieval to adapt to both new categories and new sketch distributions. A test-time training paradigm is proposed, along with the use of a self-supervised auxiliary task for sketches without paired photos. Extensive experiments demonstrate that the proposed method not only transfers to new categories but also accommodates new sketching styles.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Sketching without Worrying: Noise-Tolerant Sketch-Based Image Retrieval

Ayan Kumar Bhunia, Subhadeep Koley, Abdullah Faiz Ur Rahman Khilji, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Summary: This paper proposes an auxiliary module that allows users to sketch without worries in image retrieval. Through detecting noisy strokes and selecting the ones that contribute positively to retrieval, the module achieves significant performance improvement when combined with pre-trained retrieval models. Furthermore, the trained selector can be used in various sketch applications in a plug-and-play manner.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Doodle It Yourself: Class Incremental Learning by Drawing a Few Sketches

Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Subhadeep Koley, Rohit Kundu, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

Summary: This paper introduces a novel FSCIL framework that utilizes sketches as a modality for class support, addressing the challenges of learning from diverse modalities and limited accessibility to photos.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Partially Does It: Towards Scene-Level FG-SBIR with Partial Input

Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Viswanatha Reddy Gajjala, Aneeshan Sain, Tao Xiang, Yi-Zhe Song

Summary: This article presents a method to address the issue of partial scene sketches by using an optimal transport model to model cross-modal region associativity in a partially-aware manner, and further improving upon it to consider holistic partialness. The proposed method is robust to partial scene sketches and achieves state-of-the-art performance on existing datasets.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Article Computer Science, Artificial Intelligence

Ultrametrics for context-aware comparison of binary images

C. Lopez-Molina, S. Iglesias-Rey, B. De Baets

Summary: Quantitative image comparison is a critical topic in image processing literature, with diverse applications. Existing measures of comparison often overlook the context in which the comparison takes place. This paper presents a context-aware comparison method for binary images, tested on the BSDS500 benchmark.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Preemptively pruning Clever-Hans strategies in deep neural networks

Lorenz Linhardt, Klaus-Robert Mueller, Gregoire Montavon

Summary: This paper investigates the issue of mismatches between the decision strategy of the explainable model and the user's domain knowledge, and proposes a new method EGEM to mitigate hidden flaws in the model. Experimental results demonstrate that the approach can significantly reduce reliance on Clever Hans strategies and improve the accuracy of the model on new data.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Dual-level Deep Evidential Fusion: Integrating multimodal information for enhanced reliable decision-making in deep learning

Zhimin Shao, Weibei Dou, Yu Pan

Summary: This paper proposes a novel algorithm, Dual-level Deep Evidential Fusion (DDEF), to integrate multimodal information at both the BBA level and multimodal level, aiming to enhance accuracy, robustness, and reliability. The DDEF approach utilizes the Dirichlet framework and BBA methods for effective uncertainty estimation and employs the Dempster-Shafer Theory for dual-level fusion. The experimental results show that the proposed DDEF outperforms existing methods in synthetic digit classification and real-world medical prognosis after BCI treatment.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Multi-modal detection of fetal movements using a wearable monitor

Abhishek K. Ghosh, Danilo S. Catelli, Samuel Wilson, Niamh C. Nowlan, Ravi Vaidyanathan

Summary: The inability of current fetal movement (FM) monitoring methods to be used outside clinical environments has made it challenging to understand the nature and evolution of FM. This investigation introduces a novel wearable FM monitor with a heterogeneous sensor suite and a data fusion architecture to efficiently capture and separate FM from interfering artifacts. The performance of the device and architecture was validated through at-home use, demonstrating high accuracy in detecting and recognizing FM events. This research is a major milestone in the development of low-cost wearable FM monitors for pervasive monitoring of FM in unsupervised environments.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

ADCT-Net: Adaptive traffic forecasting neural network via dual-graphic cross-fused transformer

Jianlei Kong, Xiaomeng Fan, Min Zuo, Muhammet Deveci, Xuebo Jin, Kaiyang Zhong

Summary: In this study, we propose an intelligent traffic flow prediction framework based on the adaptive dual-graphic transformer with a cross-fusion strategy, aiming to uncover latent graphic feature representations that transcend temporal and spatial limitations. By establishing a traffic spatiotemporal prediction model using a cross-fusion attention mechanism, our proposed model achieves superior prediction performance on practical urban traffic flow datasets, particularly for long-term predictions.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Component similarity based conflict analysis: An information fusion viewpoint

Huilai Zhi, Jinhai Li

Summary: This article addresses the issue that conflict analysis based on single-valued information systems is no longer valid. It proposes a conflict analysis method based on component similarity, which uses three-way n-valued concept lattices to handle set-valued formal contexts and realizes fast conflict analysis from an information fusion viewpoint. Experimental results verify the effectiveness of this method in reducing time consumption.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Mining and fusing unstructured online reviews and structured public index data for hospital selection

Huchang Liao, Jiaxin Qi, Jiawei Zhang, Chonghui Zhang, Fan Liu, Weiping Ding

Summary: In this paper, a hospital selection approach based on a fuzzy multi-criterion decision-making method is proposed. This approach considers sentiment evaluation values of unstructured data from online reviews and structured data of public indexes simultaneously. The methodology involves collecting and processing online reviews, classifying topics and sentiments, quantifying sentiment analysis results using fuzzy numbers, and obtaining final preference scores of hospitals based on patients' preferences. A case study and robustness analysis are conducted to validate the effectiveness of the method.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Fake news detection: Taxonomy and comparative study

Faramarz Farhangian, Rafael M. O. Cruz, George D. C. Cavalcanti

Summary: The proliferation of social networks has posed challenges in combating fake news, but automatic fake news detection using artificial intelligence has become more feasible. This paper revisits the definitions and perspectives of fake news and proposes an updated taxonomy, based on multiple criteria, for the field. The study finds that optimal feature extraction techniques vary depending on the dataset, and context-dependent models based on transformer models consistently exhibit superior performance.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

A dynamic multiple classifier system using graph neural network for high dimensional overlapped data

Mariana A. Souza, Robert Sabourin, George D. C. Cavalcanti, Rafael M. O. Cruz

Summary: In this study, a dynamic selection technique is proposed to handle sparse and overlapped data. The technique leverages the relationships between instances and classifiers to learn a dynamic classifier combination rule. Experimental results show that the proposed method outperforms static selection and other dynamic selection techniques.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

A clustering method based on multi-positive-negative granularity and attenuation-diffusion pattern

Bin Yu, Ruihui Xu, Mingjie Cai, Weiping Ding

Summary: This paper introduces a clustering method based on non-Euclidean metric and multi-granularity staged clustering to address the challenges posed by complex spatial structure data to traditional clustering methods. The method improves the similarity measure and employs an attenuation-diffusion pattern for local to global clustering, achieving good clustering results.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Fast metric multi-view hashing for multimedia retrieval

Jian Zhu, Pengbo Hu, Bingqian Li, Yi Zhou

Summary: The acquisition of multi-view hash representation for heterogeneous data is highly important for multimedia retrieval. Current approaches suffer from limited retrieval precision due to insufficient integration of multi-view features and failure to effectively utilize metric information from diverse samples. In this paper, we propose an innovative method called Fast Metric Multi-View Hashing (FMMVH), which demonstrates the superiority of gate-based fusion over traditional methods. We also introduce a novel deep metric loss function to leverage metric information from dissimilar samples. By optimizing and employing model compression techniques, our FMMVH method significantly outperforms existing state-of-the-art methods on benchmark datasets, with up to 7.47% improvement in mean Average Precision (mAP).

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

SwinWave-SR: Multi-scale lightweight underwater image super-resolution

Fayaz Ali Dharejo, Iyyakutti Iyappan Ganapathi, Muhammad Zawish, Basit Alawode, Moath Alathbah, Naoufel Werghi, Sajid Javed

Summary: The resource-limited nature of underwater vision equipment affects underwater robotics and ocean engineering tasks. Super-resolution methods, particularly using Vision Transformers (ViTs), have emerged to enhance low-resolution underwater images. However, ViTs face challenges in handling severe degradation in underwater imaging. In contrast, Multi-scale ViTs (MViTs) overcome these challenges by preserving long-range dependencies through evolving channel capacity. This study proposes a novel algorithm, SwinWave-SR, for efficient and accurate multi-scale super-resolution for underwater images.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Federated split learning for sequential data in satellite-terrestrial integrated networks

Weiwei Jiang, Haoyu Han, Yang Zhang, Jianbin Mu

Summary: This study incorporates federated learning and split learning paradigms with satellite-terrestrial integrated networks and introduces a split-then-federated learning framework and federated split learning with long short-term memory to handle sequential data in STINs. The proposed solution is demonstrated to be effective through a case study of electricity theft detection based on a real-world dataset.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Software defined radio frequency sensing framework for Internet of Medical Things

Najah Abuali, Mohammad Bilal Khan, Farman Ullah, Mohammad Hayajneh, Hikmat Ullah, Shahid Mumtaz

Summary: The demand for innovative solutions in biomedical systems for precise diagnosis and management of critical diseases is increasing. Non-invasive, intelligent Internet of Medical Things (IoMT) systems have emerged as a promising technology for assessing patients with reduced health risks. This research introduces a comprehensive framework for early diagnosis of respiratory abnormalities through radio frequency (RF) sensing and software defined radio (SDR) technology. The results highlight the superior performance of deep learning frameworks in classifying respiratory anomalies.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Global-local fusion based on adversarial sample generation for image-text matching

Shichen Huang, Weina Fu, Zhaoyue Zhang, Shuai Liu

Summary: In the era of adversarial machine learning (AML), developing robust and generalized algorithms has become a key research topic. This study proposes a global similarity matching module and a global-local cognition fusion training mechanism, based on relationship adversarial sample generation, to improve image-text matching algorithms. Experimental results show significant improvements in accuracy and robustness; the method performs well when facing security challenges and promotes the fusion of visual and linguistic modalities.

INFORMATION FUSION (2024)