Article
Chemistry, Analytical
Jesus Galvan-Ruiz, Carlos M. Travieso-Gonzalez, Alejandro Pinan-Roescher, Jesus B. Alonso-Hernandez
Summary: According to the WHO, a significant percentage of the global population faces difficulty in oral communication due to hearing disorders. This article discusses the importance of developing tools to aid these individuals in daily communication. The research focuses on transcribing Spanish Sign Language (SSL) using a Leap Motion volumetric sensor capable of recognizing hand movements in 3D. By collaborating with a hearing-impaired subject and recording 176 dynamic words, the research achieves an accuracy of 95.17% in recognizing the input words using Dynamic Time Warping (DTW).
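As an illustration of the DTW matching mentioned in the summary above, the following is a minimal sketch, not the authors' implementation. It assumes each recorded word is a sequence of 3D points (for example, one tracked hand position per frame) and that a query is labeled by its nearest stored template; the names `dtw_distance`, `classify`, and the toy templates are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two sequences of 3D points."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance (Python 3.8+)
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def classify(query, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))

# Toy data (assumed for illustration only): two one-template "words".
templates = {
    "hola": [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
    "adios": [(0, 0, 0), (0, 1, 0), (0, 2, 0)],
}
query = [(0, 0, 0), (0.5, 0, 0), (1, 0, 0), (2, 0, 0)]  # slower version of "hola"
label = classify(query, templates)
```

Because DTW allows one sequence to stretch against the other, the slower query still aligns closely with its template even though the two differ in length.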
Article
Computer Science, Information Systems
Hezhen Hu, Junfu Pu, Wengang Zhou, Houqiang Li
Summary: This article presents a unified framework for multilingual continuous sign language recognition, which improves model performance by sharing a visual encoder and introducing language embeddings. Experimental results show that this method outperforms individually trained models and other state-of-the-art algorithms.
IEEE TRANSACTIONS ON MULTIMEDIA
(2023)
Article
Computer Science, Artificial Intelligence
Gul Varol, Liliane Momeni, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman
Summary: The focus of this work is sign spotting, which aims to identify whether and where a sign has been signed in a continuous, co-articulated sign language video. This is achieved by training a model using various types of available supervision, such as watching existing footage, reading associated subtitles, and looking up words in visual sign language dictionaries. The effectiveness of the approach is validated on low-shot sign spotting benchmarks. Additionally, a machine-readable British Sign Language (BSL) dictionary dataset called BslDict is provided to facilitate further study of this task.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2022)
Article
Computer Science, Artificial Intelligence
Yunus Can Bilge, Ramazan Gokberk Cinbis, Nazli Ikizler-Cinbis
Summary: This paper addresses the problem of zero-shot sign language recognition (ZSSLR), aiming to recognize instances of unseen sign classes by leveraging models learned over the seen sign classes. Textual sign descriptions and attributes from sign language dictionaries are used as semantic class representations for knowledge transfer. Three benchmark datasets are introduced to analyze the problem in detail. The proposed approach builds spatiotemporal models of body and hand regions, and shows that textual and attribute-based class definitions are effective for recognizing previously unseen sign classes within a zero-shot learning framework. Techniques to analyze the influence of binary attributes in zero-shot predictions are also introduced. The introduced approaches and datasets are expected to facilitate further exploration of zero-shot learning in sign language recognition.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Information Systems
Hao Zhou, Wengang Zhou, Yun Zhou, Houqiang Li
Summary: The research proposes a spatial-temporal multi-cue (STMC) network for video-based sign language understanding, with a spatial multi-cue (SMC) module and a temporal multi-cue (TMC) module. A joint optimization strategy and segmented attention mechanism are designed to make the best of multi-cue sources for sign language recognition and translation, achieving new state-of-the-art performance on three sign language benchmarks.
IEEE TRANSACTIONS ON MULTIMEDIA
(2022)
Article
Computer Science, Information Systems
Hamzah Luqman
Summary: Sign language recognition is an important research area, and this paper proposes a trainable deep learning network that effectively captures the spatiotemporal information of isolated sign language. The network consists of three modules and incorporates techniques to handle variations in sign samples. Experimental results show that this approach outperforms other techniques in recognizing static signs and achieves superior performance on various sign language datasets.
Review
Computer Science, Information Systems
Boban Joksimoski, Eftim Zdravevski, Petre Lameski, Ivan Miguel Pires, Francisco Jose Melero, Tomas Puebla Martinez, Nuno M. Garcia, Martin Mihajlov, Ivan Chorbev, Vladimir Trajkovik
Summary: This paper reviews the technological advancements in sign language recognition, visualization, and synthesis, highlighting the importance of technology developments in image processing and deep learning in driving new applications and tools. Analysis of nearly 2000 papers shows the significant impact of these technological advancements on improving performance metrics in sign language-related tasks.
Article
Chemistry, Analytical
Ilias Papastratis, Christos Chatzikonstantinou, Dimitrios Konstantinidis, Kosmas Dimitropoulos, Petros Daras
Summary: AI technologies are crucial in helping deaf or hearing-impaired people communicate with other communities, facilitating their social inclusion. Recent advancements in technology have enabled the development of various applications to meet the needs of these communities, but challenges still exist. Future research should focus on addressing these challenges to further advance the field.
Article
Engineering, Electrical & Electronic
Anjan Kumar Talukdar, M. K. Bhuyan
Summary: This study proposes a vision-based continuous sign language (SL) spotting system that separates meaningful signs from sign sequences using an HMM and the Viterbi algorithm, achieving a spotting rate of about 83%.
IEEE SENSORS LETTERS
(2022)
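The Talukdar and Bhuyan entry above mentions HMM decoding with the Viterbi algorithm. The following is a generic textbook sketch of Viterbi decoding, not the authors' spotting system. The two states (`sign`, `move`), the two observation symbols (`still`, `fast`), and all probabilities are assumptions chosen only to make the example run.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for an observation sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, predecessor)
    best = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        prev = best[-1]
        row = {}
        for s in states:
            # Choose the predecessor q that maximizes the path probability into s.
            p, q = max((prev[q][0] * trans_p[q][s], q) for q in states)
            row[s] = (p * emit_p[s][o], q)
        best.append(row)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]

# Hypothetical two-state model: "sign" frames tend to look still,
# "move" (transition) frames tend to look fast.
states = ("sign", "move")
start_p = {"sign": 0.5, "move": 0.5}
trans_p = {"sign": {"sign": 0.8, "move": 0.2},
           "move": {"sign": 0.2, "move": 0.8}}
emit_p = {"sign": {"still": 0.9, "fast": 0.1},
          "move": {"still": 0.1, "fast": 0.9}}
decoded = viterbi(["still", "still", "fast", "fast", "still"],
                  states, start_p, trans_p, emit_p)
```

In a spotting setting, runs of frames decoded as a sign state would be kept and the transition runs discarded; how the actual system models those states is not described in the summary.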
Review
Computer Science, Artificial Intelligence
Razieh Rastgoo, Kourosh Kiani, Sergio Escalera
Summary: Sign language is a distinct form of communication that is crucial for large groups in society, and visual sign language recognition using deep learning approaches has improved significantly in recent years. Although the overall trend shows increasing accuracy, several challenges in sign language recognition remain to be addressed.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Computer Science, Artificial Intelligence
Itsaso Rodriguez-Moreno, Jose Maria Martinez-Otzeta, Basilio Sierra
Summary: Communication between people from different communities can sometimes be hindered by language barriers. To assist language learners, a tutor for learning Spanish Sign Language has been developed. This tutor uses a webcam to capture the user's image and provides real-time feedback to help the user improve.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Review
Computer Science, Information Systems
Zeyu Liang, Huailing Li, Jianping Chai
Summary: Sign language is the main communication method for deaf and hard-of-hearing (DHH) people, and there is a need to bridge the communication gap between DHH and non-DHH individuals. This article summarizes the research progress on sign language translation, including its background, subtasks, basic mode, and transformer-based framework. The challenges of sign language translation are analyzed, and potential directions for its development are proposed.
Article
Physics, Multidisciplinary
Jing Zhao, Yi Zhang, Shiliang Sun, Haiwei Dai
Summary: The Hidden Markov model is crucial for trajectory recognition, and the sampled BP-HMM model, while effective, is inconvenient for classification and slow to converge. To improve trajectory recognition performance, a novel variational BP-HMM model has been proposed, which can share information among different classes.
Article
Automation & Control Systems
El-Sayed M. El-Alfy, Hamzah Luqman
Summary: Sign language, as a means of communication relying on visual gestures of human body parts, plays a vital role in modern society. In recent years, automated sign language processing has attracted growing attention. This survey presents a comprehensive review of the state-of-the-art literature, covering issues such as sign acquisition, segmentation, recognition, translation, and linguistic structures. It also discusses recent advances in deep machine learning and multimodal approaches, providing insights for researchers and developers in the field.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Rui Lv, Dingheng Wang, Jiangbin Zheng, Zhao-Xu Yang
Summary: In this paper, the authors investigate tensor decomposition for neural network compression. They analyze the convergence and precision of tensor mapping theory, validate the rationality of tensor mapping and its superiority over traditional tensor approximation based on the Lottery Ticket Hypothesis. They propose an efficient method called 3D-KCPNet to compress 3D convolutional neural networks using the Kronecker canonical polyadic (KCP) tensor decomposition. Experimental results show that 3D-KCPNet achieves higher accuracy compared to the original baseline model and the corresponding tensor approximation model.
Article
Computer Science, Artificial Intelligence
Xiangkun He, Zhongxu Hu, Haohan Yang, Chen Lv
Summary: In this paper, a novel constrained multi-objective reinforcement learning algorithm is proposed for personalized end-to-end robotic control with continuous actions. The approach trains a single model using constraint design and a comprehensive index to achieve optimal policies based on user-specified preferences.
Article
Computer Science, Artificial Intelligence
Zhijian Zhuo, Bilian Chen, Shenbao Yu, Langcai Cao
Summary: In this paper, a novel method called Expansion with Contraction Method for Overlapping Community Detection (ECOCD) is proposed, which utilizes non-negative matrix factorization to obtain disjoint communities and applies expansion and contraction processes to adjust the degree of overlap. ECOCD is applicable to various networks with different properties and achieves high-quality overlapping community detection.
Article
Computer Science, Artificial Intelligence
Yizhe Zhu, Chunhui Zhang, Jialin Gao, Xin Sun, Zihan Rui, Xi Zhou
Summary: In this work, the authors propose a Contrastive Spatio-Temporal Distilling (CSTD) approach to improve the detection of highly compressed deepfake videos. The approach leverages spatial-frequency cues and temporal-contrastive alignment to fully exploit spatiotemporal inconsistency information.
Review
Computer Science, Artificial Intelligence
Laijin Meng, Xinghao Jiang, Tanfeng Sun
Summary: This paper provides a review of coverless steganographic algorithms, including the development process, known contributions, and general issues in image and video algorithms. It also discusses the security of coverless steganography from theoretical analysis to actual investigation for the first time.
Article
Computer Science, Artificial Intelligence
Yajie Bao, Tianwei Xing, Xun Chen
Summary: Visual question answering requires processing multi-modal information and effective reasoning. Neural-symbolic learning is a promising method, but current approaches lack uncertainty handling and can only provide a single answer. To address this, the authors propose a confidence-based neural-symbolic approach that evaluates neural network (NN) inferences and conducts reasoning based on confidence.
Article
Computer Science, Artificial Intelligence
Anh H. Vo, Bao T. Nguyen
Summary: Interior style classification is an interesting problem with potential applications in both commercial and academic domains. This project proposes a method named ISC-DeIT, which combines data-efficient image transformer architectures and knowledge distillation, to address the interior style classification problem. Experimental results demonstrate a significant improvement in predictive accuracy compared to other state-of-the-art methods.
Article
Computer Science, Artificial Intelligence
Shashank Kotyan, Danilo Vasconcellos Vargas
Summary: This article introduces a novel augmentation technique called Dynamic Scanning Augmentation to improve the accuracy and robustness of the Vision Transformer (ViT). The technique leverages dynamic input sequences to adaptively focus on different patches, resulting in significant changes in ViT's attention mechanism. Experimental results demonstrate that Dynamic Scanning Augmentation outperforms the baseline ViT in both robustness to adversarial attacks and accuracy on natural images.
Article
Computer Science, Artificial Intelligence
Hiba Alqasir, Damien Muselet, Christophe Ducottet
Summary: The article proposes a solution to improve the learning process of a classification network by providing shape priors, reducing the need for annotated data. The solution is tested on cross-domain digit classification tasks and a video surveillance application.
Article
Computer Science, Artificial Intelligence
Dexiu Ma, Mei Liu, Mingsheng Shang
Summary: This paper proposes a method using neural dynamics solvers to solve infinity-norm optimization problems. Two improved solvers are constructed and their effectiveness and superiority are demonstrated through theoretical analysis and simulation experiments.
Article
Computer Science, Artificial Intelligence
Francesco Gregoretti, Giovanni Pezzulo, Domenico Maisto
Summary: Active Inference is a computational framework that uses probabilistic inference and variational free energy minimization to describe perception, planning, and action. cpp-AIF is a header-only C++ library that provides a powerful tool for implementing Active Inference for Partially Observable Markov Decision Processes through multi-core computing. It is cross-platform and improves performance, memory management, and usability compared to existing software.
Article
Computer Science, Artificial Intelligence
Zelin Ying, Dawei Cheng, Cen Chen, Xiang Li, Peng Zhu, Yifeng Luo, Yuqi Liang
Summary: This paper proposes a novel stock market trend prediction framework called SMART, which includes a self-supervised stock technical data sequence embedding model, S3E. Trained with multiple self-supervised auxiliary tasks, the model encodes stock technical data sequences into embeddings and uses the learned sequence embeddings to predict stock market trends. Extensive experiments on the China A-shares and NASDAQ markets demonstrate the effectiveness of the model in stock market trend prediction, and its effectiveness is further validated in real-world applications at a leading financial service provider in China.
Article
Computer Science, Artificial Intelligence
Hao Li, Hao Jiang, Dongsheng Ye, Qiang Wang, Liang Du, Yuanyuan Zeng, Liu Yuan, Yingxue Wang, C. Chen
Summary: DHGAT, a dynamic hyperbolic graph attention network, utilizes hyperbolic metric properties to embed dynamic graphs. It employs a spatiotemporal self-attention mechanism and weighted node representations, resulting in excellent performance in link prediction tasks.
Article
Computer Science, Artificial Intelligence
Jiehui Huang, Zhenchao Tang, Xuedong He, Jun Zhou, Defeng Zhou, Calvin Yu-Chian Chen
Summary: This study proposes a progressive learning multi-scale feature blending model for image deraining tasks. The model utilizes detail dilation and texture extraction to improve the restoration of rainy images. Experimental results show that the model achieves near state-of-the-art performance in rain removal tasks and exhibits better rain removal realism.
Article
Computer Science, Artificial Intelligence
Lizhi Liu, Zilin Gao, Yinhe Wang, Yongfu Li
Summary: This paper proposes a novel discrete-time interconnected model for describing complex dynamical networks. The model consists of node and edge subsystems, which account for the dynamic characteristics of both nodes and edges. By designing control strategies and coupling modes, stabilization and synchronization of the network are achieved. Simulation results demonstrate the effectiveness of the proposed methods.