Article
Computer Science, Information Systems
Tasweer Ahmad, Syed Tahir Hussain Rizvi, Neel Kanwal
Summary: In this paper, a novel idea of action embedding with a self-attention Transformer network is proposed for skeleton-based action recognition, which effectively models the latent information of skeleton data and captures both spatial and temporal features of joints. Experimental results on SYSU-3D, NTU-RGB+D, and NTU-RGB+D 120 datasets demonstrate that our method outperforms other state-of-the-art architectures.
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION
(2023)
Article
Computer Science, Artificial Intelligence
Ping Yang, Qin Wang, Hao Chen, Zizhao Wu
Summary: A novel position-aware spatio-temporal GCN is proposed for skeleton-based action recognition, which investigates positional encoding to enhance the capacity of typical baselines for comprehending action sequence characteristics. The method systematically investigates temporal position encoding, spatial position embedding, and a subgraph mask to capture sequence-ordering information, encode identity information, and mine prominent subgraph patterns, respectively. Extensive experiments show that the proposed model achieves competitive results compared to previous state-of-the-art methods.
IET COMPUTER VISION
(2023)
Article
Chemistry, Multidisciplinary
Jianing Sun, Xuan Wu, Yubin Xiao, Chunguo Wu, Yanchun Liang, Yi Liang, Liupu Wang, You Zhou
Summary: This paper proposes two attention mechanisms, namely multi-headed local self-attention (MLSA) and max-average pooling attention (MA), to extract both local and global features simultaneously, and enhance collaboration between MLSA and MA through the double attention block (DABlock). The final DANet network, composed of DABlocks and other advanced blocks, outperforms other state-of-the-art models on all datasets according to experimental results.
APPLIED SCIENCES-BASEL
(2023)
Article
Computer Science, Artificial Intelligence
Kanchan Keisham, Amin Jalali, Minho Lee
Summary: This study proposes a novel spatio-temporal attention network for online action proposal generation, which generates precise action boundaries and handles noisy features effectively, making it suitable for online tasks.
Article
Computer Science, Artificial Intelligence
Jingzheng Tu, Cailian Chen, Xiaolin Huang, Jianping He, Xinping Guan
Summary: Vehicle re-identification is a crucial technology for discovering and matching target vehicles in images taken by different cameras. This paper proposes a discriminative feature representation with spatio-temporal clues, which achieves superior performance compared to existing methods.
PATTERN RECOGNITION
(2022)
Article
Chemistry, Analytical
Yilong He, Xiao Han, Yong Zhong, Lishun Wang
Summary: The study proposes a non-local temporal difference network (NTD) for temporal action detection in videos, utilizing chunk convolution, multiple temporal coordination, and temporal difference modules. Experimental results demonstrate that NTD achieves state-of-the-art performance on multiple datasets.
Article
Robotics
Younggi Hong, Min Ju Kim, Isack Lee, Seok Bong Yoo
Summary: This paper presents an innovative model for action recognition that overcomes the limitations of using transformer structures in action recognition models. By employing a duplex attention mechanism and flow-guided information, the proposed model achieves higher accuracy in action recognition.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2023)
Article
Computer Science, Artificial Intelligence
Tasweer Ahmad, Lianwen Jin, Luojun Lin, GuoZhi Tang
Summary: This paper introduces techniques of graph sparsification and self-attention graph pooling to address issues in skeleton-based action recognition, achieving state-of-the-art results.
Article
Agriculture, Multidisciplinary
Fan Meng, Jinhui Li, Yunqi Zhang, Shaojun Qi, Yunchao Tang
Summary: Automated pineapple harvesting is a significant development in the agricultural field. However, traditional real-time detection algorithms face challenges in accurately detecting pineapples due to their complex growth conditions. Recent studies have shown the potential of Transformer models in computer vision applications. In this study, a spatio-temporal convolutional neural network model is proposed for pineapple detection, achieving an impressive accuracy rate of 92.54% and an average inference time of 0.163 s.
COMPUTERS AND ELECTRONICS IN AGRICULTURE
(2023)
Article
Computer Science, Artificial Intelligence
Zhaoqilin Yang, Gaoyun An, Ruichen Zhang, Zhenxing Zheng, Qiuqi Ruan
Summary: This paper proposes a novel two-stream inflated 3D ConvNet based on sparse regularization (SRI3D) for action recognition. The ℓ1 norm is embedded in the loss function to allow the network to learn the sparsity of the output. Experimental results show that SRI3D has a competitive advantage on Kinetics-400, Something-Something V2, UCF-101, and HMDB-51 compared to other state-of-the-art models.
IET IMAGE PROCESSING
(2023)
Article
Computer Science, Information Systems
Ziqiang Wang, Zhi Liu, Gongyang Li, Yang Wang, Tianhong Zhang, Lihua Xu, Jijun Wang
Summary: In this paper, the authors propose a novel Spatio-Temporal Self-Attention 3D Network (STSANet) for video saliency prediction, which overcomes the limitation of 3D convolution in encoding visual representation based on fixed local spacetime. The proposed model utilizes Spatio-Temporal Self-Attention (STSA) modules and an Attentional Multi-Scale Fusion (AMSF) module to capture long-range relations between spatio-temporal features and integrate multi-level features with context perception. Extensive experiments demonstrate the superiority of the proposed model compared to state-of-the-art models on DHF1K, Hollywood-2, UCF, and DIEM benchmark datasets.
IEEE TRANSACTIONS ON MULTIMEDIA
(2023)
Article
Computer Science, Artificial Intelligence
Bogyeong Lee, Sungkook Hong, Hyunsoo Kim
Summary: Despite automation reducing the number of workers in construction, worker safety remains a crucial issue. Efforts have been made to monitor safety behaviors with additional personnel, but existing methods struggle to capture workers' compliance. This study proposes an approach based on OpenPose and a spatio-temporal graph convolutional network to evaluate workers' compliance with safety regulations and provide behavior-based feedback.
ADVANCED ENGINEERING INFORMATICS
(2023)
Article
Computer Science, Artificial Intelligence
Sudhakar Kumawat, Manisha Verma, Yuta Nakashima, Shanmuganathan Raman
Summary: This study introduces a new class of convolutional blocks, STFT blocks, as an alternative to the conventional 3D convolutional layer in CNNs. These blocks reduce computational complexity and parameter count while enhancing feature learning capabilities. Extensive evaluation on seven action recognition datasets demonstrates that 3D CNNs with STFT blocks achieve comparable or even superior performance compared to state-of-the-art methods.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Honglei Zhu, Pengjuan Wei, Zhigang Xu
Summary: Due to the robustness of skeleton data, great progress has been made in skeleton-based video anomaly detection. To address the limitations of traditional models, this paper proposes a model called STEGT-AE, which combines a Transformer with an autoencoder to improve modeling capability and detection performance. The experimental results show that STEGT-AE outperforms other algorithms on four baseline datasets.
IET COMPUTER VISION
(2023)
Article
Computer Science, Artificial Intelligence
Dasom Ahn, Sangwon Kim, Byoung Chul Ko
Summary: This paper proposes an improved spatio-temporal cross attention transformer model (STAR++), which can effectively recognize various actions in videos. STAR++ innovates in the encoder structure and the application of interval attention, and optimizes the learning efficiency of attention operations through deformable 3D token selection.
APPLIED INTELLIGENCE
(2023)
Article
Computer Science, Information Systems
Pau Climent-Perez, Francisco Florez-Revuelta
Summary: This paper introduces an RGB-based visual privacy preservation filter that utilizes deep learning technology, as well as a background update scheme to limit sensitive information leakage. A comparative study shows that the union of dilated masks from different deep networks achieves the best overall result in privacy preservation.
MULTIMEDIA TOOLS AND APPLICATIONS
(2021)
Article
Chemistry, Multidisciplinary
Francisco Florez-Revuelta
Summary: This paper introduces EvoSplit, a new evolutionary approach for distributing multi-label data sets into disjoint subsets for supervised machine learning. By utilizing single-objective and multi-objective evolutionary algorithms, it aims to maximize the similarity between different distributions. Results show that EvoSplit improves data set splitting compared to iterative stratification across different measures.
APPLIED SCIENCES-BASEL
(2021)
Article
Chemistry, Analytical
Pau Climent-Perez, Francisco Florez-Revuelta
Summary: This study utilizes a many-objective evolutionary algorithm to maximize action recognition while concealing gender and age information. The results demonstrate a decrease in gender and age recognition to 58% and 39%, respectively, while action recognition remains close to its initial value of 68%.
Article
Multidisciplinary Sciences
Pau Climent-Perez, Angela M. Munoz-Anton, Angelica Poli, Susanna Spinsante, Francisco Florez-Revuelta
Summary: Several research studies have explored human activity recognition (HAR) in order to detect and recognize patterns of daily human activities. However, accurately and automatically assessing activities of daily living (ADLs) using machine learning algorithms remains a challenge, mainly due to the limited availability of realistic datasets for training and testing. The dataset presented here includes data from 52 participants (26 women and 26 men) and provides an annotated description of the data collected using a wrist-worn measurement device, the Empatica E4. The authors believe that sharing this dataset will greatly benefit the research community, particularly those involved in ADL recognition or the removal of identity cues.
Article
Environmental Sciences
Jesus Fernandez-Bermejo Ruiz, Javier Dorado Chaparro, Maria Jose Santofimia Romero, Felix Jesus Villanueva Molina, Xavier del Toro Garcia, Cristina Bolanos Peno, Henry Llumiguano Solano, Sara Colantonio, Francisco Florez-Revuelta, Juan Carlos Lopez
Summary: With increased life expectancy, the number of people in need of intensive care and attention is growing. Falls are a major concern for older adults, being the second leading cause of unintentional death globally. The lack of widespread solutions for fall detection and prevention is mainly due to privacy concerns, high cost, low performance, and the discomfort of wearable devices. This paper presents a solution that monitors bed position to detect risk situations and combines it with an automatic fall detection system. Experimental validation demonstrates high accuracy in fall detection and recognition of risk situations.
INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH
(2022)
Correction
Health Care Sciences & Services
Mladjan Jovanovic, Goran Mitrov, Eftim Zdravevski, Petre Lameski, Sara Colantonio, Martin Kampel, Hilda Tellioglu, Francisco Florez-Revuelta
JOURNAL OF MEDICAL INTERNET RESEARCH
(2022)
Review
Health Care Sciences & Services
Mladjan Jovanovic, Goran Mitrov, Eftim Zdravevski, Petre Lameski, Sara Colantonio, Martin Kampel, Hilda Tellioglu, Francisco Florez-Revuelta
Summary: This study presents a scoping review of AI models in Ambient Assisted Living (AAL), analyzing the specific models used, target domains, technology, and concerns from the end-user perspective. The findings provide insights for the development, deployment, and evaluation of future intelligent AAL systems.
JOURNAL OF MEDICAL INTERNET RESEARCH
(2022)
Review
Health Care Sciences & Services
Tamara Mujirishvili, Caterina Maidhof, Francisco Florez-Revuelta, Martina Ziefle, Miguel Richart-Martinez, Julio Cabrero-Garcia
Summary: This article provides a scoping review of studies examining the viewpoints of older adults and/or their caregivers on the acceptance and privacy perceptions of video-based active and assisted living (AAL) technology. The findings suggest that acceptance attitudes towards video-based AAL technologies are conditional and influenced by privacy concerns. Security and medical safety were identified as the major benefits of these technologies.
JOURNAL OF MEDICAL INTERNET RESEARCH
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Nahuel E. Garcia-D'Urso, Alejandro Galan-Cuenca, Pau Climent-Perez, Marcelo Saval-Calvo, Jorge Azorin-Lopez, Andres Fuster-Guillo
Summary: The overexploitation of seas and oceans has caused a loss of marine biodiversity and poses a challenge for the fishing industries. This paper introduces an automated monitoring system based on computer vision and deep learning to identify fish species.
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Kooshan Hashemifard, Francisco Florez-Revuelta
Summary: This paper proposes a method for extracting skin pixels from garment segmentation and recognition datasets, and introduces a large human skin segmentation dataset by using deep learning methods to generate automatic skin label masks. Finally, the skin detection and segmentation methods are evaluated on this dataset.
IMAGE ANALYSIS AND PROCESSING, ICIAP 2022 WORKSHOPS, PT I
(2022)
Article
Computer Science, Information Systems
Nahuel Garcia-D'urso, Pau Climent-Perez, Miriam Sanchez-Sansegundo, Ana Zaragoza-Marti, Andres Fuster-Guillo, Jorge Azorin-Lopez
Summary: Artificial intelligence techniques have been increasingly used in healthcare, including predicting cholesterol levels using machine learning approaches and identifying potential diagnosis or prognosis information through clustering analysis.