☆ 4.6 Article

Improved Action Recognition with Separable Spatio-Temporal Attention Using Alternative Skeletal and Video Pre-Processing

SENSORS (2021)

Journal

SENSORS

Volume 21, Issue 3, Pages -

Publisher

MDPI

DOI: 10.3390/s21031005

Keywords

active and assisted living; action recognition; computer vision; spatio-temporal attention; deep learning; inflated convolutional neural networks

Categories

Chemistry, Analytical; Engineering, Electrical & Electronic; Instruments & Instrumentation

Funding

Joint Programme Initiative More Years, Better Lives (JPI MYBL) [PAAL_JTC2017]
Spanish Agencia Estatal de Investigacion [PCIN-2017-114]

Summary

The potential benefits of recognizing activities of daily living from video have not been fully tapped; the same technologies are also useful for behavior understanding and lifelogging, for caregivers and end users alike. Using a separable spatio-temporal attention network from the literature, a view-invariant normalization of skeletal pose data and full activity crops for RGB data improve the baseline results by 9.5% on the cross-subject experiments, surpassing state-of-the-art techniques.

Abstract

The potential benefits of recognising activities of daily living from video for active and assisted living have yet to be fully tapped. These technologies can be used for behaviour understanding and lifelogging, for caregivers and end users alike. The recent publication of realistic datasets for this purpose, such as the Toyota Smarthomes dataset, calls for pushing forward the efforts to improve action recognition. Using the separable spatio-temporal attention network proposed in the literature, this paper introduces a view-invariant normalisation of skeletal pose data and full activity crops for RGB data, which improve the baseline results by 9.5% (on the cross-subject experiments), outperforming state-of-the-art techniques in this field when using the original unmodified skeletal data in the dataset. Our code and data are available online.
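The abstract does not spell out how the view-invariant normalisation is computed, so the following is only a minimal sketch of one common approach, not the authors' implementation: translate the skeleton so a root joint sits at the origin, then rotate it about the vertical axis until the hip line is parallel to the x axis. The function name `normalise_skeleton`, the joint indices, and the choice of y as the vertical axis are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z); y is ASSUMED to be the vertical axis


def normalise_skeleton(
    joints: Sequence[Joint],
    root_idx: int = 0,       # hypothetical index of the root (e.g. pelvis) joint
    left_hip_idx: int = 1,   # hypothetical index of the left hip
    right_hip_idx: int = 2,  # hypothetical index of the right hip
) -> List[Joint]:
    """Centre a 3D skeleton on its root and align the hip line with +x."""
    # 1. Translation invariance: move the root joint to the origin.
    rx, ry, rz = joints[root_idx]
    centred = [(x - rx, y - ry, z - rz) for x, y, z in joints]

    # 2. Direction of the left-to-right hip line in the horizontal (x-z) plane.
    lx, _, lz = centred[left_hip_idx]
    hx, _, hz = centred[right_hip_idx]
    theta = math.atan2(hz - lz, hx - lx)

    # 3. Rotation invariance: rotate by -theta about the vertical axis,
    #    which maps the hip direction onto the +x axis.
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in centred]
```

Under these assumptions, the same pose observed from two cameras that differ by a yaw rotation and a translation maps to (numerically) the same normalised coordinates, which is the property a view-invariant pre-processing step is meant to provide. Camera pitch and roll are not handled by this sketch.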

Recommended

Article Computer Science, Information Systems

Transforming spatio-temporal self-attention using action embedding for skeleton-based action recognition

Tasweer Ahmad, Syed Tahir Hussain Rizvi, Neel Kanwal

Summary: In this paper, a novel idea of action embedding with a self-attention Transformer network is proposed for skeleton-based action recognition, which effectively models the latent information of skeleton data and captures both spatial and temporal features of joints. Experimental results on SYSU-3D, NTU-RGB+D, and NTU-RGB+D 120 datasets demonstrate that the proposed method outperforms other state-of-the-art architectures.

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION (2023)

Article Computer Science, Artificial Intelligence

Position-aware spatio-temporal graph convolutional networks for skeleton-based action recognition

Ping Yang, Qin Wang, Hao Chen, Zizhao Wu

Summary: A novel position-aware spatio-temporal GCN is proposed for skeleton-based action recognition, which investigates positional encoding to enhance the capacity of typical baselines for comprehending action sequence characteristics. The method systematically investigates temporal position encoding, spatial position embedding, and a subgraph mask, which respectively capture sequence ordering information, encode identity information, and mine prominent subgraph patterns. Extensive experiments show that the proposed model achieves competitive results compared to previous state-of-the-art methods.

IET COMPUTER VISION (2023)

Article Chemistry, Multidisciplinary

DANet: Temporal Action Localization with Double Attention

Jianing Sun, Xuan Wu, Yubin Xiao, Chunguo Wu, Yanchun Liang, Yi Liang, Liupu Wang, You Zhou

Summary: This paper proposes two attention mechanisms, namely multi-headed local self-attention (MLSA) and max-average pooling attention (MA), to extract both local and global features simultaneously, and enhance collaboration between MLSA and MA through the double attention block (DABlock). The final DANet network, composed of DABlocks and other advanced blocks, outperforms other state-of-the-art models on all datasets according to experimental results.

APPLIED SCIENCES-BASEL (2023)

Article Computer Science, Artificial Intelligence

Online action proposal generation using spatio-temporal attention network

Kanchan Keisham, Amin Jalali, Minho Lee

Summary: This study proposes a novel spatio-temporal attention network for online action proposal generation, which can generate precise action boundaries and handle noisy features effectively, suitable for online tasks.

NEURAL NETWORKS (2022)

Article Computer Science, Artificial Intelligence

DFR-ST: Discriminative feature representation with spatio-temporal cues for vehicle re-identification

Jingzheng Tu, Cailian Chen, Xiaolin Huang, Jianping He, Xinping Guan

Summary: Vehicle re-identification is a crucial technology for discovering and matching target vehicles in images taken by different cameras. This paper proposes a discriminative feature representation with spatio-temporal cues, which achieves superior performance compared to existing methods.

PATTERN RECOGNITION (2022)

Article Chemistry, Analytical

Non-Local Temporal Difference Network for Temporal Action Detection

Yilong He, Xiao Han, Yong Zhong, Lishun Wang

Summary: The study proposes a non-local temporal difference network (NTD) for temporal action detection in videos, utilizing chunk convolution, multiple temporal coordination, and temporal difference modules. Experimental results demonstrate that NTD achieves state-of-the-art performance on multiple datasets.

SENSORS (2022)

Article Robotics

Fluxformer: Flow-Guided Duplex Attention Transformer via Spatio-Temporal Clustering for Action Recognition

Younggi Hong, Min Ju Kim, Isack Lee, Seok Bong Yoo

Summary: This paper presents an innovative model for action recognition that overcomes the limitations of using transformer structures in action recognition models. By employing a duplex attention mechanism and flow-guided information, the proposed model achieves higher accuracy in action recognition.

IEEE ROBOTICS AND AUTOMATION LETTERS (2023)

Article Computer Science, Artificial Intelligence

Skeleton-based action recognition using sparse spatio-temporal GCN with edge effective resistance

Tasweer Ahmad, Lianwen Jin, Luojun Lin, GuoZhi Tang

Summary: This paper introduces techniques of graph sparsification and self-attention graph pooling to address issues in skeleton-based action recognition, achieving state-of-the-art results.

NEUROCOMPUTING (2021)

Article Agriculture, Multidisciplinary

Transforming unmanned pineapple picking with spatio-temporal convolutional neural networks

Fan Meng, Jinhui Li, Yunqi Zhang, Shaojun Qi, Yunchao Tang

Summary: Automated pineapple harvesting is a significant development in the agricultural field. However, traditional real-time detection algorithms face challenges in accurately detecting pineapples due to their complex growth conditions. Recent studies have shown the potential of Transformer models in computer vision applications. In this study, a spatio-temporal convolutional neural network model is proposed for pineapple detection, achieving an impressive accuracy rate of 92.54% and an average inference time of 0.163 s.

COMPUTERS AND ELECTRONICS IN AGRICULTURE (2023)

Article Computer Science, Artificial Intelligence

SRI3D: Two-stream inflated 3D ConvNet based on sparse regularization for action recognition

Zhaoqilin Yang, Gaoyun An, Ruichen Zhang, Zhenxing Zheng, Qiuqi Ruan

Summary: This paper proposes a novel two-stream inflated 3D ConvNet based on sparse regularization (SRI3D) for action recognition. The ℓ1 norm is embedded in the loss function to allow the network to learn the sparsity of the output. Experimental results show that SRI3D has a competitive advantage on Kinetics-400, Something-Something V2, UCF-101, and HMDB-51 compared to other state-of-the-art models.

IET IMAGE PROCESSING (2023)

Article Computer Science, Information Systems

Spatio-Temporal Self-Attention Network for Video Saliency Prediction

Ziqiang Wang, Zhi Liu, Gongyang Li, Yang Wang, Tianhong Zhang, Lihua Xu, Jijun Wang

Summary: In this paper, the authors propose a novel Spatio-Temporal Self-Attention 3D Network (STSANet) for video saliency prediction, which overcomes the limitation of 3D convolution in encoding visual representation based on fixed local spacetime. The proposed model utilizes Spatio-Temporal Self-Attention (STSA) modules and Attentional Multi-Scale Fusion (AMSF) module to capture long-range relations between spatio-temporal features and integrate multi-level features with context perception. Extensive experiments demonstrate the superiority of the proposed model compared to state-of-the-art models on DHF1K, Hollywood-2, UCF, and DIEM benchmark datasets.

IEEE TRANSACTIONS ON MULTIMEDIA (2023)

Article Computer Science, Artificial Intelligence

Determination of workers' compliance to safety regulations using a spatio-temporal graph convolution network

Bogyeong Lee, Sungkook Hong, Hyunsoo Kim

Summary: Despite automation reducing the number of workers in construction, worker safety remains a crucial issue. Efforts have been made to monitor safety behaviors with additional personnel, but existing methods struggle to capture workers' compliance. This study proposes an approach based on OpenPose and a spatio-temporal graph convolutional network to evaluate workers' compliance with safety regulations and provide behavior-based feedback.

ADVANCED ENGINEERING INFORMATICS (2023)

Article Computer Science, Artificial Intelligence

Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition

Sudhakar Kumawat, Manisha Verma, Yuta Nakashima, Shanmuganathan Raman

Summary: This study introduces a new class of convolutional blocks, STFT blocks, as an alternative to the conventional 3D convolutional layer in CNNs. These blocks reduce computational complexity and parameter count while enhancing feature learning capabilities. Extensive evaluation on seven action recognition datasets demonstrates that 3D CNNs with STFT blocks achieve comparable or even superior performance compared to state-of-the-art methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Computer Science, Artificial Intelligence

A Spatio-Temporal Enhanced Graph-Transformer AutoEncoder embedded pose for anomaly detection

Honglei Zhu, Pengjuan Wei, Zhigang Xu

Summary: Due to the robustness of skeleton data, great progress has been made in skeleton-based video anomaly detection. To address the limitations of traditional models, this paper proposes a model called STEGT-AE, which applies Transformer and autoencoder to improve the capability of modeling and detection performance. The experimental results show that STEGT-AE outperforms other algorithms on four baseline datasets.

IET COMPUTER VISION (2023)

Article Computer Science, Artificial Intelligence

STAR++: Rethinking spatio-temporal cross attention transformer for video action recognition

Dasom Ahn, Sangwon Kim, Byoung Chul Ko

Summary: This paper proposes an improved spatio-temporal cross attention transformer model (STAR++), which can effectively recognize various actions in videos. STAR++ innovates in the encoder structure and the application of interval attention, and optimizes the learning efficiency of attention operations through deformable 3D token selection.

APPLIED INTELLIGENCE (2023)

Article Computer Science, Information Systems

Protection of visual privacy in videos acquired with RGB cameras for active and assisted living applications

Pau Climent-Perez, Francisco Florez-Revuelta

Summary: This paper introduces an RGB-based visual privacy preservation filter that utilizes deep learning technology, as well as a background update scheme to limit sensitive information leakage. A comparative study shows that the union of dilated masks from different deep networks achieves the best overall result in privacy preservation.

MULTIMEDIA TOOLS AND APPLICATIONS (2021)

Article Chemistry, Multidisciplinary

EvoSplit: An Evolutionary Approach to Split a Multi-Label Data Set into Disjoint Subsets

Francisco Florez-Revuelta

Summary: This paper introduces EvoSplit, a new evolutionary approach for distributing multi-label data sets into disjoint subsets for supervised machine learning. By utilizing single-objective and multi-objective evolutionary algorithms, it aims to maximize the similarity between different distributions. Results show that EvoSplit improves data set splitting compared to iterative stratification across different measures.

APPLIED SCIENCES-BASEL (2021)

Article Chemistry, Analytical

Privacy-Preserving Human Action Recognition with a Many-Objective Evolutionary Algorithm

Pau Climent-Perez, Francisco Florez-Revuelta

Summary: This study utilizes a many-objective evolutionary algorithm to maximize action recognition while concealing gender and age information. The results demonstrate a decrease in gender and age recognition to 58% and 39%, respectively, while action recognition remains closer to the initial value of 68%.

SENSORS (2022)

Article Multidisciplinary Sciences

Dataset of acceleration signals recorded while performing activities of daily living

Pau Climent-Perez, Angela M. Munoz-Anton, Angelica Poli, Susanna Spinsante, Francisco Florez-Revuelta

Summary: Several research studies have explored human activity recognition (HAR) in order to detect and recognize patterns of daily human activities. However, accurately and automatically assessing activities of daily living (ADLs) using machine learning algorithms remains a challenge, mainly due to limited availability of realistic datasets for training and testing. This dataset includes data from 52 participants (26 women and 26 men) and provides an annotated description of the data collected using a wrist-worn measurement device, Empatica E4. The authors believe that sharing this dataset will greatly benefit the research community, particularly those involved in ADL recognition or the removal of identity cues.

DATA IN BRIEF (2022)

Article Environmental Sciences

Bedtime Monitoring for Fall Detection and Prevention in Older Adults

Jesus Fernandez-Bermejo Ruiz, Javier Dorado Chaparro, Maria Jose Santofimia Romero, Felix Jesus Villanueva Molina, Xavier del Toro Garcia, Cristina Bolanos Peno, Henry Llumiguano Solano, Sara Colantonio, Francisco Florez-Revuelta, Juan Carlos Lopez

Summary: With increased life expectancy, the number of people in need of intensive care and attention is growing. Falls are a major concern for older adults, being the second leading cause of unintentional death globally. The lack of widespread solutions for fall detection and prevention is mainly due to privacy concerns, high cost, low performance, and the discomfort of wearable devices. This paper presents a solution that monitors bed position to detect risk situations and combines it with an automatic fall detection system. Experimental validation demonstrates high accuracy in fall detection and recognition of risk situations.

INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH (2022)

Correction Health Care Sciences & Services

Ambient Assisted Living: Scoping Review of Artificial Intelligence Models, Domains, Technology, and Concerns (vol 24, e36553, 2022)

Mladjan Jovanovic, Goran Mitrov, Eftim Zdravevski, Petre Lameski, Sara Colantonio, Martin Kampel, Hilda Tellioglu, Francisco Florez-Revuelta

JOURNAL OF MEDICAL INTERNET RESEARCH (2022)

Review Health Care Sciences & Services

Ambient Assisted Living: Scoping Review of Artificial Intelligence Models, Domains, Technology, and Concerns

Mladjan Jovanovic, Goran Mitrov, Eftim Zdravevski, Petre Lameski, Sara Colantonio, Martin Kampel, Hilda Tellioglu, Francisco Florez-Revuelta

Summary: This study presents a scoping review of AI models in Ambient Assisted Living (AAL), analyzing the specific models used, target domains, technology, and concerns from the end-user perspective. The findings provide insights for the development, deployment, and evaluation of future intelligent AAL systems.

JOURNAL OF MEDICAL INTERNET RESEARCH (2022)

Review Health Care Sciences & Services

Acceptance and Privacy Perceptions Toward Video-based Active and Assisted Living Technologies: Scoping Review

Tamara Mujirishvili, Caterina Maidhof, Francisco Florez-Revuelta, Martina Ziefle, Miguel Richart-Martinez, Julio Cabrero-Garcia

Summary: This article provides a scoping review of studies examining the viewpoints of older adults and/or their caregivers on the acceptance and privacy perceptions of video-based active and assisted living (AAL) technology. The findings suggest that acceptance attitudes towards video-based AAL technologies are conditional and influenced by privacy concerns. Security and medical safety were identified as the major benefits of these technologies.

JOURNAL OF MEDICAL INTERNET RESEARCH (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Efficient instance segmentation using deep learning for species identification in fish markets

Nahuel E. Garcia-D'Urso, Alejandro Galan-Cuenca, Pau Climent-Perez, Marcelo Saval-Calvo, Jorge Azorin-Lopez, Andres Fuster-Guillo

Summary: The overexploitation of seas and oceans has caused a loss of marine biodiversity and poses a challenge for the fishing industries. This paper introduces an automated monitoring system based on computer vision and deep learning to identify fish species.

2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

From Garment to Skin: The visuAAL Skin Segmentation Dataset

Kooshan Hashemifard, Francisco Florez-Revuelta

Summary: This paper proposes a method for extracting skin pixels from garment segmentation and recognition datasets, and introduces a large human skin segmentation dataset by using deep learning methods to generate automatic skin label masks. Finally, the skin detection and segmentation methods are evaluated on this dataset.

IMAGE ANALYSIS AND PROCESSING, ICIAP 2022 WORKSHOPS, PT I (2022)

Article Computer Science, Information Systems

A Non-Invasive Approach for Total Cholesterol Level Prediction Using Machine Learning

Nahuel Garcia-D'urso, Pau Climent-Perez, Miriam Sanchez-Sansegundo, Ana Zaragoza-Marti, Andres Fuster-Guillo, Jorge Azorin-Lopez

Summary: Artificial intelligence techniques have been increasingly used in healthcare, including predicting cholesterol levels using machine learning approaches and identifying potential diagnosis or prognosis information through clustering analysis.

IEEE ACCESS (2022)
