Article

Training CNNs for 3-D Sign Language Recognition With Color Texture Coded Joint Angular Displacement Maps

Journal

IEEE SIGNAL PROCESSING LETTERS
Volume 25, Issue 5, Pages -

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LSP.2018.2817179

Keywords

Convolutional neural network; joint angular displacement map; three-dimensional (3-D) sign language recognition

Funding

  1. research project scheme titled Visual-Verbal Machine Interpreter Fostering Hearing Impaired and Elderly by the Technology Interventions for Disabled and Elderly program of the Department of Science and Technology, SEED Division, Govt. of India, Ministr [SEED/TIDE/013/2014(G)]

Abstract

Convolutional neural networks (CNNs) can be remarkably effective for recognizing two-dimensional and three-dimensional (3-D) actions. To further explore their potential, we applied CNNs to the recognition of 3-D motion-captured sign language (SL). The 3-D spatio-temporal information of each sign was interpreted using joint angular displacement maps (JADMs), which encode the sign as a color texture image; JADMs were calculated for all joint pairs. Multiple CNN layers then capitalized on the differences between these images and identified discriminative spatio-temporal features. We compared the performance of the proposed model against state-of-the-art baseline models using our own 3-D SL dataset and two benchmark action datasets, HDM05 and CMU.
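The JADM idea described above can be illustrated with a minimal sketch. The function names, the grayscale normalization, and the exact angle definition below are illustrative assumptions, not the paper's implementation: the paper encodes the maps as color texture images (e.g. via a colormap), whereas this sketch stops at a normalized grayscale map over all joint pairs.

```python
import numpy as np

def joint_angular_displacements(seq):
    """Angle (radians) between each pair of joint position vectors, per frame.

    seq: array of shape (T, J, 3) -- T frames of J joints in 3-D.
    Returns an array of shape (T, J*(J-1)//2), one column per joint pair.
    """
    T, J, _ = seq.shape
    pairs = [(i, j) for i in range(J) for j in range(i + 1, J)]
    out = np.empty((T, len(pairs)))
    for p, (i, j) in enumerate(pairs):
        a, b = seq[:, i, :], seq[:, j, :]
        cos = np.sum(a * b, axis=1) / (
            np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-9)
        out[:, p] = np.arccos(np.clip(cos, -1.0, 1.0))
    return out

def jadm_image(seq):
    """Frame-to-frame angular displacement, normalized to [0, 255].

    Applying a colormap to the result would yield a color texture image
    of shape (T-1, n_pairs) suitable as CNN input.
    """
    ang = joint_angular_displacements(seq)
    disp = np.abs(np.diff(ang, axis=0))              # displacement between frames
    disp = (disp - disp.min()) / (np.ptp(disp) + 1e-9)
    return (disp * 255).astype(np.uint8)
```

For a 10-frame, 5-joint sequence this produces a 9 x 10 map (ten joint pairs), so each sign of a fixed length maps to a fixed-size image regardless of signer or speed.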


Recommended

Article Computer Science, Information Systems

Multi-view motion modelled deep attention networks (M2DA-Net) for video based sign language recognition

M. Suneetha, M. V. D. Prasad, P. V. V. Kishore

Summary: The study explores video-based sign language recognition using deep learning models, introducing a multi-stream CNN combined with a multi-view attention mechanism to address view invariance and improve recognition accuracy.

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION (2021)

Article Computer Science, Information Systems

Early estimation model for 3D-discrete indian sign language recognition using graph matching

E. Kiran Kumar, P. V. V. Kishore, D. Anil Kumar, M. Teja Kiran Kumar

Summary: The study proposes using 3D motion capture technology and a graph matching algorithm for machine translation of sign language, addressing two key issues in sign recognition and introducing a two-phase solution.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2021)

Article Computer Science, Information Systems

3D sign language recognition using spatio temporal graph kernels

D. Anil Kumar, A. S. C. S. Sastry, P. V. V. Kishore, E. Kiran Kumar

Summary: 3D sign language recognition is challenging due to the complex spatio-temporal variations of hands and fingers. A twin motion algorithm is proposed to address the variable motion joints, resulting in a method that is signer invariant, motion invariant, and faster compared to state-of-the-art graph kernel methods.

JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES (2022)

Article Computer Science, Information Systems

Sharable and unshareable within class multi view deep metric latent feature learning for video-based sign language recognition

M. Suneetha, M. V. D. Prasad, P. V. V. Kishore

Summary: This study introduces a novel model for building a view sensitive environment in multi-view sign language recognition, utilizing metric learning to extract features from multiple views and demonstrating higher accuracy in experiments.

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Article Computer Science, Artificial Intelligence

Multiview meta-metric learning for sign language recognition using triplet loss embeddings

Suneetha Mopidevi, M. V. D. Prasad, Polurie Venkata Vijay Kishore

Summary: In this paper, a multiview meta-metric learning model is proposed for video-based sign language recognition. Unlike traditional metric learning, this approach is based on set-based distances and utilizes meta-cells and task-based learning. The proposed model also introduces a maximum view pooled distance for binding intra-class views. Experimental results demonstrate that the multiview meta-metric learning model achieves higher accuracies than the baselines on multiview sign language and human action recognition datasets.

PATTERN ANALYSIS AND APPLICATIONS (2023)

Review Computer Science, Theory & Methods

A Short Review on the Role of Various Deep Learning Techniques for Segmenting and Classifying Brain Tumours from MRI Images

Kumari Kavitha, E. Kiran Kumar

Summary: This paper discusses the methods of early identification and segmentation of brain tumors using deep learning techniques, and provides new research and clinical solutions.

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS (2022)

Article Computer Science, Theory & Methods

Deep Multi View Spatio Temporal Spectral Feature Embedding on Skeletal Sign Language Videos for Recognition

Sk Ashraf Ali, M. V. D. Prasad, P. Praveen Kumar, P. V. V. Kishore

Summary: The primary objective of this work is to build a competitive global view from multiple views within a class label. This involves extracting spatio temporal features from videos of skeletal sign language using a 3D convolutional neural network, and ensembling them into a low dimensional subspace. The constructed global view is then utilized as training data for sign language recognition.

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS (2022)

Article Computer Science, Information Systems

Can Skeletal Joint Positional Ordering Influence Action Recognition on Spectrally Graded CNNs: A Perspective on Achieving Joint Order Independent Learning

M. Teja Kiran Kumar, P. V. V. Kishore, B. T. P. Madhav, D. Anil Kumar, N. Sasi Kala, K. Praveen Kumar Rao, B. Prasad

Summary: The research explores the impact of multiple random skeletal joint ordered features on deep learning systems, proposing a novel idea of learning skeletal joint volumetric features on a spectrally graded CNN. The study demonstrates that joint order independent feature learning is achievable on CNNs trained on quantified spatio temporal feature maps extracted from randomly shuffled skeletal joints.

IEEE ACCESS (2021)