Article
Robotics
Boris Ivanovic, Karen Leung, Edward Schmerling, Marco Pavone
Summary: Human behavior prediction models are crucial for designing safe and proactive robot planning algorithms, but modeling complex interaction dynamics and capturing multiple possible outcomes is challenging. The CVAE approach can generate multimodal probability distributions over future human trajectories, prompting research on various methods for human behavior prediction.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2021)
Article
Automation & Control Systems
Lei Pi, Qiang Zhang, Lingfang Yang, Zhi Huang
Summary: A Spatio-Temporal Graph Convolution Neural Network based Social Interaction Model (STGCNN-SIM) is proposed to accurately predict human trajectories by utilizing historical and speculated trajectory information to extract social interactive features and model interaction behaviors. Three social interactive features, including relative distance, angle between velocity vectors, and angles between velocity vectors and distance vector, are explicitly extracted from observed and speculated trajectories. STGCNN-SIM utilizes these features to model interactions with surroundings and employs an attention mechanism to improve the model's performance. Experimental results on three public datasets demonstrate that STGCNN-SIM achieves higher accuracy and stability compared to state-of-the-art methods.
ROBOTICS AND AUTONOMOUS SYSTEMS
(2023)
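The three social interactive features named in the STGCNN-SIM summary above (relative distance, angle between velocity vectors, and angles between each velocity vector and the distance vector) can be illustrated geometrically for one pair of pedestrians. This is only a sketch of one plausible reading: the source does not give the paper's exact formulas, so the function names, the dictionary keys, and the convention of returning 0 for a zero-length vector are all assumptions.

```python
import math

def angle_between(u, v):
    """Angle in radians between 2D vectors u and v.

    Assumption: returns 0.0 when either vector has zero length.
    """
    norm_u, norm_v = math.hypot(u[0], u[1]), math.hypot(v[0], v[1])
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    cos_angle = (u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, cos_angle)))

def social_features(p_i, v_i, p_j, v_j):
    """Hypothetical pairwise features for pedestrians i and j.

    p_* are 2D positions, v_* are 2D velocities.
    """
    d = (p_j[0] - p_i[0], p_j[1] - p_i[1])  # distance vector from i to j
    return {
        "distance": math.hypot(d[0], d[1]),
        "velocity_angle": angle_between(v_i, v_j),
        "angle_i_to_d": angle_between(v_i, d),
        "angle_j_to_d": angle_between(v_j, d),
    }

# Toy example: i at the origin moving along x, j at (3, 4) moving along y.
feats = social_features((0.0, 0.0), (1.0, 0.0), (3.0, 4.0), (0.0, 1.0))
print(feats)
```

In a graph-based model, features like these would typically be computed for every pedestrian pair at every time step and used as edge attributes; how STGCNN-SIM actually consumes them is not stated in the summary.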
Article
Computer Science, Artificial Intelligence
Hao-Yun Chen, Pei-Han Huang, Li-Chen Fu
Summary: This paper proposes a hierarchical path planning algorithm that combines an RGB camera and LiDAR to capture local crowd movement and predict nearby people's movement. It generates an appropriate global path for the robot using crowd information and social norms. The system accurately tracks human locations and allows the robot to plan efficient and socially acceptable paths.
Article
Engineering, Multidisciplinary
Yuanman Li, Rongqin Liang, Wei Wei, Wei Wang, Jiantao Zhou, Xia Li
Summary: This paper proposes a novel method for pedestrian trajectory prediction, using a temporal pyramid network and attention mechanism to effectively model and predict complex social interactions. Experimental results demonstrate the superiority of this method.
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING
(2022)
Article
Computer Science, Artificial Intelligence
Wei Mao, Miaomiao Liu, Mathieu Salzmann, Hongdong Li
Summary: This study presents an attention-based feed-forward network for predicting future human poses by leveraging motion attention to capture similarity between current motion context and historical motion sub-sequences. Different types of attention, at joint, body part, and full pose levels, were investigated for effectively exploiting motion patterns from long-term history for pose prediction, resulting in state-of-the-art results on three datasets.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2021)
Article
Engineering, Civil
Parth Kothari, Sven Kreiss, Alexandre Alahi
Summary: This study explores the development of human trajectory forecasting, comparing handcrafted representations with deep learning methods, and proposing two data-driven approaches to effectively capture social interactions. By establishing the TrajNet++ benchmark and introducing new performance metrics, the superiority of the proposed method on real-world and synthetic datasets is validated.
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Yusheng Peng, Gaofeng Zhang, Jun Shi, Benzhu Xu, Liping Zheng
Summary: Pedestrian trajectory prediction is an important research topic in computer vision, and this paper proposes an LSTM model based on social relation attention and interaction awareness to simulate social behavior during pedestrian walking. By using social relation features and attention mechanism, more accurate trajectory prediction is achieved.
Article
Computer Science, Artificial Intelligence
Rongqin Liang, Yuanman Li, Jiantao Zhou, Xia Li
Summary: The pedestrian trajectory prediction task is crucial for intelligent systems and has wide applications. Existing approaches face limitations in accurately generating diverse trajectories, resulting in biased and inaccurate results. This article proposes a novel generative flow-based framework called STGlow, which optimizes the exact log-likelihood of motion behaviors to more precisely model the underlying data distribution. It also introduces a dual-graphormer combined with the graph structure to adequately model temporal dependencies and mutual spatial interactions. Experimental results demonstrate that our method outperforms previous state-of-the-art approaches.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Engineering, Electrical & Electronic
Yingfeng Cai, Zihao Wang, Hai Wang, Long Chen, Yicheng Li, Miguel Angel Sotelo, Zhixiong Li
Summary: This study introduces a novel Environment-Attention Network model (EA-Net) to address the challenge of modeling interaction relationships in vehicle trajectory prediction. By constructing a parallel structure of Graph Attention network and Convolutional social pooling, comprehensive and effective feature information is extracted, leading to superior prediction accuracy compared to existing models in testing scenarios.
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
(2021)
Article
Engineering, Electrical & Electronic
Kai Lv, Liang Yuan
Summary: In this article, a novel prediction model called the social knowledge-guided graph attention convolutional network (SKGACN) is proposed to address the social interactions and spatiotemporal relationships between pedestrians with low computational requirements. Experimental results show that our method performs better in terms of average displacement error (ADE) and final displacement error (FDE) metrics compared to the state-of-the-art methods.
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
(2023)
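The two metrics named in the SKGACN summary above have standard definitions in the trajectory prediction literature: ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final time step only. A minimal sketch for a single pedestrian follows; the function names and toy coordinates are illustrative assumptions and are not taken from the paper.

```python
import math

def ade(pred, truth):
    """Average displacement error: mean Euclidean distance over all time steps."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)

def fde(pred, truth):
    """Final displacement error: Euclidean distance at the last time step."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    return math.dist(pred[-1], truth[-1])

# Toy trajectories (made-up data): the prediction drifts upward by 1 unit per step.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

print(ade(pred, truth))  # per-step errors are 0, 1, 2, so the mean is 1.0
print(fde(pred, truth))  # error at the last step is 2.0
```

Benchmarks usually average both values over all pedestrians in the test set, and multimodal predictors often report the best of several sampled trajectories.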
Article
Computer Science, Artificial Intelligence
Luca Rossi, Marina Paolanti, Roberto Pierdicca, Emanuele Frontoni
Summary: Human trajectory prediction is a complex subject that involves challenges such as human-space interaction, human-human interaction, multimodality, and generalizability. This study proposes new deep learning models and datasets to address these challenges and achieve better generalizability in predicting human trajectories. Experimental results demonstrate that the proposed models and datasets outperform state-of-the-art works and better capture the complexities of multimodal scenarios.
PATTERN RECOGNITION
(2021)
Article
Robotics
Lei Zhou, Dingye Yang, Xiaolin Zhai, Shichao Wu, Zhengxi Hu, Jingtai Liu
Summary: This paper proposes a novel trajectory prediction framework, GA-STT, which effectively models socially aware spatial interaction and complex temporal dependencies among groups through a group aware spatial-temporal transformer network. Experimental results demonstrate that our model outperforms the state-of-the-art method in predicting complex spatial-temporal interactions.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2022)
Article
Environmental Sciences
Yuanyuan Liu, Shaoqiang Wang, Jinghua Chen, Bin Chen, Xiaobo Wang, Dongze Hao, Leigang Sun
Summary: In this study, the researchers proposed a transformer-based model, Informer, to predict rice yield in the Indian Indo-Gangetic Plains. By integrating time-series satellite data, environmental variables, and rice yield records, Informer achieved better performance than other models for end-of-season prediction. The model was also able to achieve stable performances for within-season prediction after late September.
Article
Engineering, Civil
Guiming Sun, Heng Qi, Yanming Shen, Baocai Yin
Summary: In this paper, a temporal-context-based self-attention network named TCSA-Net is proposed, which can simultaneously exploit long- and short-term movement preferences from sparse and long trajectories. The network outperforms state-of-the-art methods in terms of standard evaluation metrics, thanks to its novel two-stage self-attention architecture and multi-modal embedding layer.
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
(2022)
Article
Engineering, Electrical & Electronic
Guohao Zhang, Penghui Xu, Haosheng Xu, Li-Ta Hsu
Summary: GNSS performance in urban canyons can be significantly degraded by interferences, but deep learning networks can predict these interferences and improve predictive accuracy. Experimental results show that the proposed deep learning networks can accurately predict satellite visibility and pseudorange errors.
IEEE SENSORS JOURNAL
(2021)
Article
Computer Science, Artificial Intelligence
Kien Nguyen, Clinton Fookes, Sridha Sridharan, Arun Ross
Summary: In this paper, we design a fully complex-valued neural network specifically for iris recognition. By capturing both phase and magnitude information, our network outperforms real-valued networks in representing the biometric content of iris texture. The experiments on benchmark datasets show that our proposed network improves the performance of iris recognition when compared to traditional methods.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Dung Nguyen, Duc Thanh Nguyen, Sridha Sridharan, Simon Denman, Thanh Thi Nguyen, David Dean, Clinton Fookes
Summary: Deep learning has made significant progress in automatic emotion recognition, but pre-trained models have limited generalization ability due to insufficient training data. To address this issue, we propose a PathNet-based meta-transfer learning method that can transfer emotional knowledge between different domains and improve emotion recognition accuracy. Experimental results show that our method outperforms existing transfer learning methods in facial expression and speech emotion recognition.
NEURAL COMPUTING & APPLICATIONS
(2023)
Article
Computer Science, Information Systems
Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
Summary: Generative Adversarial Networks (GANs) are a revolutionary innovation in machine learning that enable the generation of artificial data. In the medical field, where collecting and annotating real data is difficult, artificial data synthesis is valuable. However, the capabilities of generative models for data generation, especially in biosignal modality transfer, have not been fully exploited in biomedical research. In this study, we analyze and evaluate the application of adversarial learning on biosignal data, focusing on synthesizing 1D biosignal data and modality transfer. Our results show superior performance in biosignal generation and modality transfer, making clinical monitoring more convenient for patients.
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS
(2023)
Review
Computer Science, Theory & Methods
Harshala Gammulle, David Ahmedt-Aristizabal, Simon Denman, Lachlan Tychsen-Smith, Lars Petersson, Clinton Fookes
Summary: In this paper, a comprehensive review of prediction models and action segmentation methods in video stream analysis is provided. The feature extraction and learning strategies used in state-of-the-art methods are thoroughly analyzed and compared. The impact of object detection and tracking techniques on human action segmentation is also discussed, as well as the limitations and key research directions for improving interpretability, generalization, optimization, and deployment.
ACM COMPUTING SURVEYS
(2023)
Article
Computer Science, Artificial Intelligence
Amena Khatun, Simon Denman, Sridha Sridharan, Clinton Fookes
Summary: In this paper, an end-to-end pose-driven attention-guided generative adversarial network is proposed to generate multiple poses of a person. The attention mechanism is used to learn and transfer the subject pose, and a semantic-consistency loss is proposed to preserve the semantic information during pose transfer. Appearance and pose discriminators are utilized to ensure the realism and consistency of the transferred images. Incorporating the proposed approach in a person re-identification framework yields realistic pose-transferred images and state-of-the-art re-identification results.
PATTERN RECOGNITION
(2023)
Article
Robotics
Kavisha Vidanapathirana, Peyman Moghadam, Sridha Sridharan, Clinton Fookes
Summary: This paper presents an efficient spectral method called SpectralGV for geometric verification and re-ranking. It is able to identify the correct candidate among potential matches retrieved by global similarity search without requiring resource intensive point cloud registration.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2023)
Article
Engineering, Electrical & Electronic
Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
Summary: Electrocardiograms (ECGs) are a viable method for diagnosing cardiovascular diseases (CVDs). Machine learning algorithms, such as deep neural networks trained on ECG signals, have shown promising results in identifying CVDs. However, existing models for ECG anomaly detection require long training times and computational resources. To overcome this, we propose a novel deep learning architecture that utilizes dilated convolution layers, allowing for learning from short ECG segments and flexibly diagnosing CVDs.
IEEE SENSORS JOURNAL
(2023)
Article
Computer Science, Artificial Intelligence
Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
Summary: In this study, a novel deep learning architecture called the multi-stage stacked TCN is proposed for biosignal segmentation and anomaly localization based on TCNs. The architecture uses multiple TCN modules with different dilation factors and employs convolution-based fusion for combining predictions. The model achieves state-of-the-art performance in five different tasks related to three 1D biosignal modalities and outperforms traditional multi-stage TCN models with similar configurations.
PATTERN RECOGNITION
(2023)
Article
Geochemistry & Geophysics
Tharindu Fernando, Clinton Fookes, Harshala Gammulle, Simon Denman, Sridha Sridharan
Summary: With advancements in low-power embedded computing devices and remote sensing instruments, the traditional satellite image processing pipeline is being replaced by on-board processing of data, enabling timely intelligence extraction on the satellite itself. The on-board processing of multispectral satellite images is limited to classification and segmentation tasks, but we aim to extend it to panoptic segmentation and evaluate the applicability of state-of-the-art models in an on-board setting. Our proposed multimodal teacher network and online knowledge distillation framework improve segmentation accuracy and demonstrate significant improvements in segmentation quality metrics for on-board processing.
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
(2023)
Article
Computer Science, Artificial Intelligence
Kien Nguyen, Tharindu Fernando, Clinton Fookes, Sridha Sridharan
Summary: Modern automated surveillance techniques rely on deep learning methods, but these methods are susceptible to adversarial attacks. Attackers can bypass detection and recognition of surveillance systems by altering their appearance or behavior, posing a threat to security. This article reviews recent attempts and findings in physical adversarial attacks on surveillance systems, and proposes strategies for defense and evaluation.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Huy Nguyen, Kien Nguyen, Sridha Sridharan, Clinton Fookes
Summary: This study proposes a new benchmark dataset, AG-ReID, for person re-identification across aerial and ground cameras. The dataset, collected by a UAV and a ground-based CCTV camera, presents a novel elevated-viewpoint challenge and employs an explainable algorithm to address it.
2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME
(2023)
Proceedings Paper
Automation & Control Systems
Joshua Knights, Kavisha Vidanapathirana, Milad Ramezani, Sridha Sridharan, Clinton Fookes, Peyman Moghadam
Summary: Wild-Places is a challenging large-scale dataset specifically designed for lidar place recognition in unstructured, natural environments. It contains eight lidar sequences with a total of 63K submaps and provides accurate ground truth for both loop closure detection and re-localisation tasks.
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2023)
(2023)
Article
Computer Science, Artificial Intelligence
Hamdan Abdellatef, Lina J. Karam
Summary: This paper proposes performing the learning and inference processes in the compressed domain to reduce the computational complexity and improve the speed of neural networks. Experimental results show that a modified ResNet-50 in the compressed domain is 70% faster than the traditional spatial-based ResNet-50 while maintaining similar accuracy. Additionally, a preprocessing step with partial encoding is suggested to improve resilience to distortions caused by low-quality encoded images. Training a network with highly compressed data can achieve good classification accuracy with significantly reduced storage requirements.
Article
Computer Science, Artificial Intelligence
Victor R. Barradas, Yasuharu Koike, Nicolas Schweighofer
Summary: Inverse models are essential for human motor learning as they map desired actions to motor commands. The shape of the error surface and the distribution of targets in a task play a crucial role in determining the speed of learning.
Article
Computer Science, Artificial Intelligence
Ting Zhou, Hanshu Yan, Jingfeng Zhang, Lei Liu, Bo Han
Summary: We propose a defense strategy that reduces the success rate of data poisoning attacks in downstream tasks by pre-training a robust foundation model.
Article
Computer Science, Artificial Intelligence
Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao
Summary: In this paper, the convergence rate of AdaSAM in the stochastic non-convex setting is analyzed. A theoretical proof shows that AdaSAM has a linear speedup property and decouples the stochastic gradient steps with the adaptive learning rate and perturbed gradient. Experimental results demonstrate that AdaSAM outperforms other optimizers.
Article
Computer Science, Artificial Intelligence
Juntong Yun, Du Jiang, Li Huang, Bo Tao, Shangchun Liao, Ying Liu, Xin Liu, Gongfa Li, Disi Chen, Baojia Chen
Summary: In this study, a dual-manipulator grasping detection model based on the Markov decision process is proposed. By parameterizing the grasping detection model of dual manipulators using a cross-entropy convolutional neural network and a fully convolutional neural network, stable grasping of complex multiple objects is achieved. Robot grasping experiments were conducted to verify the feasibility and superiority of this method.
Article
Computer Science, Artificial Intelligence
Miaohui Zhang, Kaifang Li, Jianxin Ma, Xile Wang
Summary: This paper proposes an unsupervised person re-identification (Re-ID) method that uses two asymmetric networks to generate pseudo-labels for each other by clustering and updates and optimizes the pseudo-labels through alternate training. It also designs similarity compensation and similarity suppression based on the camera ID of pedestrian images to optimize the similarity measure. Extensive experiments show that the proposed method achieves superior performance compared to state-of-the-art unsupervised person re-identification methods.
Article
Computer Science, Artificial Intelligence
Florian Bacho, Dominique Chu
Summary: This paper proposes a new approach called the Forward Direct Feedback Alignment algorithm for supervised learning in deep neural networks. By combining activity-perturbed forward gradients, direct feedback alignment, and momentum, this method achieves better performance and convergence speed compared to other local alternatives to backpropagation.
Article
Computer Science, Artificial Intelligence
Xiaojian Ding, Yi Li, Shilin Chen
Summary: This research paper addresses the limitations of recursive feature elimination (RFE) and its variants in high-dimensional feature selection tasks. The proposed algorithms, which introduce a novel feature ranking criterion and an optimal feature subset evaluation algorithm, outperform current state-of-the-art methods.
Article
Computer Science, Artificial Intelligence
Naoko Koide-Majima, Shinji Nishimoto, Kei Majima
Summary: Visual images observed by humans can be reconstructed from brain activity, and the visualization of arbitrary natural images from mental imagery has been achieved through an improved method. This study provides a unique tool for directly investigating the subjective contents of the brain.
Article
Computer Science, Artificial Intelligence
Huanjie Tao, Qianyue Duan
Summary: In this paper, a hierarchical attention network with progressive feature fusion is proposed for facial expression recognition (FER), addressing the challenges posed by pose variation, occlusions, and illumination variation. The model achieves enhanced performance by aggregating diverse features and progressively enhancing discriminative features.
Article
Computer Science, Artificial Intelligence
Zhenyi Wang, Pengfei Yang, Linwei Hu, Bowen Zhang, Chengmin Lin, Wenkai Lv, Quan Wang
Summary: In the face of the complex landscape of deep learning, we propose a novel subgraph-level performance prediction method called SLAPP, which combines graph and operator features through an innovative graph neural network called EAGAT, providing accurate performance predictions. In addition, we introduce a mixed loss design with dynamic weight adjustment to improve predictive accuracy.
Article
Computer Science, Artificial Intelligence
Yiyang Yin, Shuangling Luo, Jun Zhou, Liang Kang, Calvin Yu-Chian Chen
Summary: Medical image segmentation is crucial for modern healthcare systems, especially in reducing surgical risks and planning treatments. Transanal total mesorectal excision (TaTME) has become an important method for treating colon and rectum cancers. Real-time instance segmentation during TaTME surgeries can assist surgeons in minimizing risks. However, the dynamic variations in TaTME images pose challenges for accurate instance segmentation.
Article
Computer Science, Artificial Intelligence
Teng Cheng, Lei Sun, Junning Zhang, Jinling Wang, Zhanyang Wei
Summary: This study proposes a scheme that combines the start-stop point signal features for wideband multi-signal detection, called Fast Spectrum-Size Self-Training network (FSSNet). By utilizing start-stop points to build the signal model, this method successfully solves the difficulty of existing deep learning methods in detecting discontinuous signals and achieves satisfactory detection speed.
Article
Computer Science, Artificial Intelligence
Wenming Wu, Xiaoke Ma, Quan Wang, Maoguo Gong, Quanxue Gao
Summary: The layer-specific modules in multi-layer networks are critical for understanding the structure and function of the system. However, existing methods fail to accurately characterize and balance the connectivity and specificity of these modules. To address this issue, a joint learning graph clustering algorithm (DRDF) is proposed, which learns the deep representation and discriminative features of the multi-layer network, and balances the connectivity and specificity of the layer-specific modules through joint learning.
Article
Computer Science, Artificial Intelligence
Guanghui Yue, Guibin Zhuo, Weiqing Yan, Tianwei Zhou, Chang Tang, Peng Yang, Tianfu Wang
Summary: This paper proposes a novel boundary uncertainty aware network (BUNet) for precise and robust colorectal polyp segmentation. BUNet utilizes a pyramid vision transformer encoder to learn multi-scale features and incorporates a boundary exploration module (BEM) and a boundary uncertainty aware module (BUM) to handle boundary areas. Experimental results demonstrate that BUNet outperforms other methods in terms of performance and generalization ability.