Article
Computer Science, Artificial Intelligence
Jian Dong, Wankou Yang, Yazhou Yao, Fatih Porikli
Summary: Human action recognition in visual data is a fundamental challenge in computer vision, with existing approaches mainly based on video data. This paper introduces a novel method that transfers knowledge from action videos to images for recognizing actions in still images. Results show that knowledge transferred from color and motion flow sequences can significantly improve the performance of still-image-based human action recognition.
PATTERN RECOGNITION
(2021)
Article
Ecology
Luciano Araujo Dourado-Filho, Rodrigo Tripodi Calumby
Summary: The study compares the effectiveness of multiple pre-trained Deep Convolutional Neural Networks in extracting deep features from images of multi-organ plant observations and evaluates the extracted features using SVM classifiers, revealing the importance of experimental assessment for maximizing classification accuracy.
ECOLOGICAL INFORMATICS
(2021)
Article
Computer Science, Artificial Intelligence
Umar Asif, Deval Mehta, Stefan Von Cavallar, Jianbin Tang, Stefan Harrer
Summary: This paper presents a holistic framework for video-based action recognition by combining spatial and motion features from the body, face, and hands. The proposed Deep Action Stamps (DeepActs) encode effective spatio-temporal features and improve action recognition accuracy compared to methods based on limited body joints. DeepActsNet, a deep-learning-based ensemble model, achieves highly accurate action recognition with less computational cost.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Information Systems
Ge Yang, Wu-xing Zou
Summary: This paper proposes a deep learning network model based on the fusion of spatio-temporal features (FSTFN) to improve accuracy in action recognition tasks. The model extracts and fuses temporal and spatial information, processes large-scale video frame information using multi-segment input, and enhances the weight of visual subjects through an attention mechanism.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Y. L. Chang, C. S. Chan, P. Remagnino
Summary: The paper proposes a novel framework called LVAR for generic action classification in videos, which introduces a partial recurrence connection for propagating information within each layer. This framework improves action recognition performance by accessing long-term information in videos of different lengths.
NEURAL COMPUTING & APPLICATIONS
(2021)
Article
Chemistry, Analytical
Seemab Khan, Muhammad Attique Khan, Majed Alhaisoni, Usman Tariq, Hwan-Seung Yong, Ammar Armghan, Fayadh Alenezi
Summary: Human action recognition (HAR) is crucial for smart surveillance systems but poses challenges due to the variety of actions and large video sequences. Deep learning (DL) systems have shown significant success in HAR, achieving high accuracies on multiple datasets. The proposed DL-based design includes feature mapping, fusion, and selection steps, and it outperforms state-of-the-art methods in terms of computational time.
Article
Computer Science, Artificial Intelligence
Gaurvi Goyal, Nicoletta Noceti, Francesca Odone
Summary: In this work, we propose a methodological pipeline that addresses the challenges of cross-view action recognition with a specific focus on small-scale datasets and resource efficiency. By transferring knowledge from an intermediate pre-trained representation and utilizing an effective domain adaptation strategy and a robust classifier, our approach promotes view-invariant properties and enables efficient generalization to unseen viewpoints in action recognition.
IMAGE AND VISION COMPUTING
(2022)
Article
Computer Science, Information Systems
Tej Singh, Dinesh Kumar Vishwakarma
Summary: Researchers proposed a novel framework for human activity recognition in videos, utilizing deep learning models and multimodal data fusion. By combining different modalities of information and leveraging the advantages of depth sensors, the method improves activity classification accuracy. Experimental results show that the approach outperforms existing methods on four standard RGB-D datasets.
MULTIMEDIA TOOLS AND APPLICATIONS
(2021)
Article
Computer Science, Information Systems
Muhammad Naeem Akbar, Farhan Riaz, Ahmed Bilal Awan, Muhammad Attique Khan, Usman Tariq, Saad Rehman
Summary: This study proposes an accurate human action recognition framework based on deep learning and an improved feature optimization algorithm. It achieves high accuracy in recognizing human actions on different datasets through several critical steps, including feature extraction and classification.
CMC-COMPUTERS MATERIALS & CONTINUA
(2022)
Article
Computer Science, Information Systems
Tehseen Ahsan, Sohail Khalid, Shaheryar Najam, Muhammad Attique Khan, Ye Jin Kim, Byoungchol Chang
Summary: Human action recognition (HAR) aims to understand and classify behavior. It has diverse applications in computer vision, including video surveillance. Despite challenges like similar actions and feature extraction, this research proposes an end-to-end framework using deep learning and an improved tree seed optimization algorithm. The framework involves frame preprocessing, fine-tuning pre-trained models, fusing deep learning features, optimizing fused features, and classifying them using machine learning classifiers. Experimental results on five datasets show higher accuracy compared to previous techniques.
CMC-COMPUTERS MATERIALS & CONTINUA
(2023)
Article
Computer Science, Artificial Intelligence
Norman Tasfi, Eder Santana, Luisa Liboni, Miriam Capretz
Summary: The Successor Feature framework improves task transfer in Reinforcement Learning by decomposing the state-action value function. However, the original formulation may fail due to changes in the reward function. This paper proposes the Dynamic Successor Feature framework, DynSF, which centers around a learned state-transition model and dynamically induces the acting policy. The flexibility of DynSF extends to the architecture, requiring only a state-transition model and a small vector of parameters.
KNOWLEDGE-BASED SYSTEMS
(2023)
Article
Computer Science, Information Systems
Sadia Kiran, Muhammad Attique Khan, Muhammad Younus Javed, Majed Alhaisoni, Usman Tariq, Yunyoung Nam, Robertas Damasevicius, Muhammad Sharif
Summary: This article proposes a new method for Human Action Recognition (HAR) using deep learning and feature fusion techniques, achieving high accuracy in several main application domains. The method is evaluated on five different datasets, showing improved accuracy in action recognition.
CMC-COMPUTERS MATERIALS & CONTINUA
(2021)
Article
Computer Science, Information Systems
Mousa Alhajlah
Summary: In this paper, a novel FER framework is proposed for patient monitoring. Preprocessing and data balancing are performed, followed by training two lightweight, efficient CNN models, MobileNetV2 and NasNetMobile, and extracting feature vectors. The WOA algorithm is used to remove irrelevant features from these vectors, and the optimized features are passed to the classifier. Experimental results show that the proposed model achieves 82.5% accuracy, outperforming state-of-the-art techniques while using 2.8 times fewer features.
CMC-COMPUTERS MATERIALS & CONTINUA
(2023)
Article
Computer Science, Artificial Intelligence
Vittorio Mazzia, Simone Angarano, Francesco Salvetti, Federico Angelini, Marcello Chiaberge
Summary: This research introduces an attention-based Action Transformer (AcT) architecture that outperforms more elaborate networks mixing convolutional, recurrent, and attentive layers in human action recognition, leveraging small temporal windows of 2D pose representations for low-latency real-time performance. Additionally, a large-scale dataset called MPOSE2021 has been open-sourced to serve as a benchmark for training and evaluating real-time, short-time HAR. The proposed methodology was extensively tested on MPOSE2021, showcasing the effectiveness of the AcT model and setting the groundwork for future HAR research.
PATTERN RECOGNITION
(2022)
Article
Computer Science, Artificial Intelligence
Fang Liu, Xiangmin Xu, Xiaofen Xing, Kailing Guo, Lin Wang
Summary: Complex action recognition is a challenging problem due to the uncontrolled nature of scenes. This paper proposes a simple-action-guided dictionary learning model (SAG-DLM) for complex action recognition, which reconstructs complex actions using learned common and difference dictionaries. Experimental results validate the effectiveness of the proposed model.
Article
Chemistry, Analytical
Jose L. Gomez, Gabriel Villalonga, Antonio M. Lopez
Summary: This paper proposes a new co-training procedure for the unsupervised domain adaptation of semantic segmentation models from synthetic to real images. The procedure involves training intermediate deep models with both synthetic and real images and iteratively labeling real-world training images. The collaboration between the models is achieved through a self-training stage and a model collaboration loop. Experimental results demonstrate significant improvements over baselines on standard synthetic and real-world datasets.
Article
Computer Science, Artificial Intelligence
Nicolae-Catalin Ristea, Andreea-Iuliana Miron, Olivian Savencu, Mariana-Iuliana Georgescu, Nicolae Verga, Fahad Shahbaz Khan, Radu Tudor Ionescu
Summary: We propose a novel approach to translate unpaired contrast CT scans to non-contrast CT scans and vice versa. Our method is based on cycle-consistent generative adversarial convolutional transformers, which can be trained on unpaired images and achieve superior results through a multi-level cycle-consistency loss. We also introduce a novel dataset and show that our approach outperforms state-of-the-art methods for image style transfer in the medical domain.
Review
Environmental Sciences
Abdulaziz Amer Aleissaee, Amandeep Kumar, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal, Gui-Song Xia, Fahad Shahbaz Khan
Summary: Deep learning algorithms have gained popularity in remote sensing image analysis, and transformer-based architectures have been widely used in computer vision, with self-attention mechanisms replacing the convolution operator. Inspired by this, the remote sensing community has explored vision transformers for various tasks. This survey presents a systematic review of recent transformer-based methods in remote sensing, covering different sub-areas such as very high-resolution (VHR), hyperspectral (HSI), and synthetic aperture radar (SAR) imagery. The survey concludes by discussing challenges and open issues of transformers in remote sensing.
Article
Computer Science, Artificial Intelligence
Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, Huazhu Fu
Summary: This survey reviews the applications of Transformers in medical imaging, covering tasks such as medical image segmentation, detection, classification, restoration, synthesis, registration, and clinical report generation. The challenges and solutions for each application are discussed, and future research directions are highlighted. The survey aims to spark interest in the academic community and provide researchers with an up-to-date reference regarding the applications of Transformer models in medical imaging.
MEDICAL IMAGE ANALYSIS
(2023)
Article
Computer Science, Artificial Intelligence
Lu Yu, Xialei Liu, Joost van de Weijer
Summary: This paper addresses the problem of catastrophic forgetting in deep neural networks for class-incremental semantic segmentation. A self-training approach is proposed, leveraging unlabeled data for rehearsal of previous knowledge. Experimental results show that maximizing self-entropy and using diverse auxiliary data can significantly improve performance. State-of-the-art results are achieved on the Pascal-VOC 2012 and ADE20K datasets.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Mustansar Fiaz, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan
Summary: This paper proposes a three-stage cascaded Scale-Augmented Transformer (SAT) framework for person search, which combines the benefits of convolutional neural networks and transformers. Experimental results demonstrate the favorable performance of the method compared to state-of-the-art methods on challenging datasets.
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Andreea-Iuliana Miron, Olivian Savencu, Nicolae-Catalin Ristea, Nicolae Verga, Fahad Shahbaz Khan
Summary: We propose a novel multimodal multi-head convolutional attention module for super-resolving CT and MRI scans, which outperforms state-of-the-art attention mechanisms in super-resolution. By jointly processing the CT and MRI scans in a multimodal fashion, our attention module improves the quality of super-resolution results.
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV)
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Mustansar Fiaz, Hisham Cholakkal, Sanath Narayan, Rao Muhammad Anwer, Fahad Shahbaz Khan
Summary: This paper proposes a novel attention-aware relation mixer (ARM) module for person search, which exploits the global relation between different local regions within the region of interest (RoI) of a person, making it robust against appearance deformations and background distractors.
COMPUTER VISION - ACCV 2022, PT V
(2023)
Article
Computer Science, Artificial Intelligence
Marc Masana, Xialei Liu, Bartlomiej Twardowski, Mikel Menta, Andrew D. Bagdanov, Joost van de Weijer
Summary: For future learning systems, incremental learning is desirable due to its efficient resource usage, reduced memory usage, and resemblance to human learning. The main challenge for incremental learning is catastrophic forgetting. This paper provides a comprehensive survey of existing class-incremental learning methods for image classification and performs extensive experimental evaluations on thirteen methods.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Sajid Javed, Martin Danelljan, Fahad Shahbaz Khan, Muhammad Haris Khan, Michael Felsberg, Jiri Matas
Summary: Accurate and robust visual object tracking is a challenging problem in computer vision. This survey reviews more than 90 Discriminative Correlation Filters (DCFs) and Siamese trackers, based on results in nine tracking benchmarks. It presents the background theory, research challenges, and performance analysis of both DCFs and Siamese trackers, and provides recommendations for future research.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Fatih Porikli
Summary: To address the vulnerability of CNNs to imperceptible changes in input images, an adversarial training approach is proposed. This approach creates adversarial perturbations by utilizing the style, content, and class-boundary information of target samples. A deeply supervised multi-task objective is used to extract multi-scale feature knowledge, and a max-margin adversarial training approach is applied to minimize the distance between the source image and its adversary while maximizing the distance between the adversary and the target image. This adversarial training approach demonstrates strong robustness, generalization to corruptions and data distribution shifts, and accuracy on clean examples.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Information Systems
Adel R. Alharbi, Sajjad Shaukat Jamal, Muhammad Fahad Khan, Mohammad Asif Gondal, Aaqif Afzaal Abbasi
Summary: This paper proposes an innovative approach for constructing dynamic S-boxes using Gaussian distribution-based pseudo-random sequences. The proposed technique overcomes the weaknesses of existing chaos-based S-box techniques by leveraging the strength of pseudo-random sequences. The technique achieves a maximum nonlinearity of 112, which is comparable to the AES algorithm.
Proceedings Paper
Computer Science, Artificial Intelligence
Senmao Li, Joost van de Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang
Summary: Recent advances in 3D-aware generative models combined with Neural Radiance Fields have achieved impressive results in 3D-consistent multi-class image-to-image translation. To address the unrealistic shape/identity change in 2D-I2I translation, the learning process is divided into a multi-class 3D-aware GAN step and a 3D-aware I2I translation step, with novel techniques proposed to reduce view-consistency problems.
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Aitor Alvarez-Gila, Joost van de Weijer, Yaxing Wang, Estibaliz Garrote
Summary: MVMO is a synthetic dataset with high object density and wide camera baselines, enabling research in multi-view semantic segmentation and cross-view semantic transfer. New research is needed to utilize the information from multi-view setups effectively.
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Vacit Oguz Yazici, Joost Van De Weijer, Longlong Yu
Summary: This paper investigates the problem of multi-label image classification and proposes an enhanced transformer model that utilizes primal object queries to improve model performance and convergence speed.
2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR)
(2022)