Article

Fourier Lucas-Kanade Algorithm

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2012.220

Keywords

Lucas & Kanade (LK); Fourier domain; illumination invariance; active appearance model (AAM)

Funding

  1. Australian Research Council [FT0991969]

In this paper, we propose a framework for gradient descent alignment of both images and objects in the Fourier domain. Our method centers upon the classical Lucas & Kanade (LK) algorithm, where we represent the source and template/model in the complex 2D Fourier domain rather than in the spatial 2D domain. We refer to our approach as the Fourier LK (FLK) algorithm. The FLK formulation is advantageous when one preprocesses the source image and template/model with a bank of filters (e.g., oriented edges, Gabor, etc.) as 1) it can handle substantial illumination variations, 2) the inefficient preprocessing filter bank step can be subsumed within the FLK algorithm as a sparse diagonal weighting matrix, 3) unlike traditional LK, the computational cost is invariant to the number of filters and as a result is far more efficient, and 4) this approach can be extended to the Inverse Compositional (IC) form of the LK algorithm, where nearly all steps (including the Fourier transform and filter bank preprocessing) can be precomputed, leading to an extremely efficient and robust approach to gradient descent image matching. Further, these computational savings translate to nonrigid object alignment tasks that are considered extensions of the LK algorithm, such as those found in Active Appearance Models (AAMs).
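The claim that a filter bank can be folded into a diagonal weighting in the Fourier domain can be illustrated with a small numerical sketch. This is not the authors' FLK implementation: it only checks, via Parseval's theorem and the convolution theorem, that the summed squared error over filtered images equals a single weighted squared error over Fourier coefficients. The function names, the use of Sobel filters, and the assumption of circular (wrap-around) convolution are all choices made for this illustration.

```python
import numpy as np

# Two example filters (an assumption for this sketch; the abstract mentions
# oriented edge and Gabor filters but does not fix a particular bank).
SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def circular_convolve(image, kernel):
    """Circular 2D convolution computed directly in the spatial domain."""
    out = np.zeros_like(image, dtype=float)
    for u in range(kernel.shape[0]):
        for v in range(kernel.shape[1]):
            # np.roll shifts with wrap-around, so out[x, y] accumulates
            # kernel[u, v] * image[x - u, y - v].
            out += kernel[u, v] * np.roll(image, shift=(u, v), axis=(0, 1))
    return out


def filter_bank_ssd_spatial(source, template, filters):
    """Sum over filters of the squared error between filtered images.

    The cost grows with the number of filters: one convolution per filter.
    """
    diff = source - template
    return sum(np.sum(circular_convolve(diff, g) ** 2) for g in filters)


def fourier_weights(filters, shape):
    """Diagonal weighting: sum of squared filter magnitudes per frequency."""
    return sum(np.abs(np.fft.fft2(g, s=shape)) ** 2 for g in filters)


def filter_bank_ssd_fourier(source, template, filters):
    """Same error, evaluated as one weighted sum over Fourier coefficients.

    The weights can be precomputed once, so the per-evaluation cost does
    not depend on how many filters are in the bank.
    """
    weights = fourier_weights(filters, source.shape)
    diff_hat = np.fft.fft2(source - template)
    # Division by the number of pixels accounts for NumPy's unnormalized FFT.
    return np.sum(weights * np.abs(diff_hat) ** 2) / diff_hat.size
```

Because both Sobel kernels sum to zero, their weight at the zero-frequency (DC) coefficient is zero, so adding a constant brightness offset to the source leaves the Fourier-domain error unchanged. This is one simple way to see how such a weighting can discount some illumination changes, under the assumptions stated above.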

Recommended

Article Engineering, Electrical & Electronic

Channel Graph Regularized Correlation Filters for Visual Object Tracking

Monika Jain, Arjun Tyagi, A. Subramanyam, Simon Denman, Sridha Sridharan, Clinton Fookes

Summary: Explored the application of channel regularization and graph regularization methods in visual object tracking, improving the performance and discriminative power of learned filters, effectively solving the issue of uneven weight assignment to feature channels.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)

Article Computer Science, Information Systems

Robust and Interpretable Temporal Convolution Network for Event Detection in Lung Sound Recordings

Tharindu Fernando, Sridha Sridharan, Simon Denman, Houman Ghaemmaghami, Clinton Fookes

Summary: This paper introduces a novel framework for detecting lung sound events by using a multi-branch TCN architecture and feature fusion to identify discrete events in lung sound recordings. The proposed method shows promising results on multiple benchmarks, aiding in the identification of respiratory diseases. The feature concatenation strategy effectively suppresses non-informative features, leading to the construction of a lightweight network.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2022)

Article Computer Science, Information Systems

Geometric Deep Learning for Subject Independent Epileptic Seizure Prediction Using Scalp EEG Signals

Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes

Summary: In this study, a subject-independent seizure predictor using Geometric Deep Learning (GDL) is proposed. The models achieve state-of-the-art performance on two benchmark datasets and this is the first study that proposes synthesizing subject-specific graphs for seizure prediction. Furthermore, the model interpretation shows potential contribution of this method towards Scalp EEG-based seizure localization.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2022)

Article Robotics

Elasticity Meets Continuous-Time: Map-Centric Dense 3D LiDAR SLAM

Chanoh Park, Peyman Moghadam, Jason Williams, Soohwan Kim, Sridha Sridharan, Clinton Fookes

Summary: The article introduces a novel map-centric SLAM framework, ElasticLiDAR++, which overcomes the challenges of multimodal sensor fusion and LiDAR motion distortion. Using a local continuous-time trajectory representation, the method achieves nonredundant yet dense mapping through a surface resolution preserving matching algorithm and surfel fusion model.

IEEE TRANSACTIONS ON ROBOTICS (2022)

Article Computer Science, Artificial Intelligence

Complex-Valued Iris Recognition Network

Kien Nguyen, Clinton Fookes, Sridha Sridharan, Arun Ross

Summary: In this paper, we design a fully complex-valued neural network specifically for iris recognition. By capturing both phase and magnitude information, our network outperforms real-valued networks in representing the biometric content of iris texture. The experiments on benchmark datasets show that our proposed network improves the performance of iris recognition when compared to traditional methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Information Systems

Generalized Generative Deep Learning Models for Biosignal Synthesis and Modality Transfer

Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes

Summary: Generative Adversarial Networks (GANs) are a revolutionary innovation in machine learning that enable the generation of artificial data. In the medical field, where collecting and annotating real data is difficult, artificial data synthesis is valuable. However, the capabilities of generative models for data generation, especially in biosignal modality transfer, have not been fully exploited in biomedical research. In this study, we analyze and evaluate the application of adversarial learning on biosignal data, focusing on synthesizing 1D biosignal data and modality transfer. Our results show superior performance in biosignal generation and modality transfer, making clinical monitoring more convenient for patients.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2023)

Article Computer Science, Artificial Intelligence

Pose-driven attention-guided image generation for person re-Identification

Amena Khatun, Simon Denman, Sridha Sridharan, Clinton Fookes

Summary: In this paper, an end-to-end pose-driven attention-guided generative adversarial network is proposed to generate multiple poses of a person. The attention mechanism is used to learn and transfer the subject pose, and a semantic-consistency loss is proposed to preserve the semantic information during pose transfer. Appearance and pose discriminators are utilized to ensure the realism and consistency of the transferred images. Incorporating the proposed approach in a person re-identification framework achieves realistic pose transferred images and state-of-the-art re-identification results.

PATTERN RECOGNITION (2023)

Article Robotics

Spectral Geometric Verification: Re-Ranking Point Cloud Retrieval for Metric Localization

Kavisha Vidanapathirana, Peyman Moghadam, Sridha Sridharan, Clinton Fookes

Summary: This paper presents an efficient spectral method called SpectralGV for geometric verification and re-ranking. It is able to identify the correct candidate among potential matches retrieved by global similarity search without requiring resource intensive point cloud registration.

IEEE ROBOTICS AND AUTOMATION LETTERS (2023)

Article Engineering, Electrical & Electronic

DConv-LSTM-Net: A Novel Architecture for Single- and 12-Lead ECG Anomaly Detection

Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes

Summary: Electrocardiograms (ECGs) are a viable method for diagnosing cardiovascular diseases (CVDs). Machine learning algorithms, such as deep neural networks trained on ECG signals, have shown promising results in identifying CVDs. However, existing models for ECG anomaly detection require long training times and computational resources. To overcome this, we propose a novel deep learning architecture that utilizes dilated convolution layers, allowing for learning from short ECG segments and flexibly diagnosing CVDs.

IEEE SENSORS JOURNAL (2023)

Article Geochemistry & Geophysics

Toward On-Board Panoptic Segmentation of Multispectral Satellite Images

Tharindu Fernando, Clinton Fookes, Harshala Gammulle, Simon Denman, Sridha Sridharan

Summary: With advancements in low-power embedded computing devices and remote sensing instruments, the traditional satellite image processing pipeline is being replaced by on-board processing of data, enabling timely intelligence extraction on the satellite itself. The on-board processing of multispectral satellite images is limited to classification and segmentation tasks, but we aim to extend it to panoptic segmentation and evaluate the applicability of state-of-the-art models in an on-board setting. Our proposed multimodal teacher network and online knowledge distillation framework improve segmentation accuracy and demonstrate significant improvements in segmentation quality metrics for on-board processing.

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (2023)

Article Computer Science, Artificial Intelligence

Physical Adversarial Attacks for Surveillance: A Survey

Kien Nguyen, Tharindu Fernando, Clinton Fookes, Sridha Sridharan

Summary: Modern automated surveillance techniques rely on deep learning methods, but these methods are susceptible to adversarial attacks. Attackers can bypass detection and recognition of surveillance systems by altering their appearance or behavior, posing a threat to security. This article reviews recent attempts and findings in physical adversarial attacks on surveillance systems, and proposes strategies for defense and evaluation.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Aerial-Ground Person Re-ID

Huy Nguyen, Kien Nguyen, Sridha Sridharan, Clinton Fookes

Summary: This study proposes a new benchmark dataset, AG-ReID, for person re-identification across aerial and ground cameras. The dataset, collected by a UAV and a ground-based CCTV camera, presents a novel elevated-viewpoint challenge and employs an explainable algorithm to address it.

2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME (2023)

Proceedings Paper Automation & Control Systems

Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environments

Joshua Knights, Kavisha Vidanapathirana, Milad Ramezani, Sridha Sridharan, Clinton Fookes, Peyman Moghadam

Summary: Wild-Places is a challenging large-scale dataset specifically designed for lidar place recognition in unstructured, natural environments. It contains eight lidar sequences with a total of 63K submaps and provides accurate ground truth for both loop closure detection and re-localisation tasks.

2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2023) (2023)

Article Computer Science, Information Systems

Jointly Trained Conversion Model With LPCNet for Any-to-One Voice Conversion Using Speaker-Independent Linguistic Features

Ivan Himawan, Ruizhe Wang, Sridha Sridharan, Clinton Fookes

Summary: This study proposes a joint training scheme for an any-to-one voice conversion system with LPCNet to enhance the naturalness, speaker similarity, and intelligibility of converted speech. By incorporating speaker-independent features derived from an automatic speech recognition model, the conversion model accurately captures the linguistic contents of the given utterance and maps them to the acoustic representations used by LPCNet. Experimental results demonstrate that the proposed model enables real-time voice conversion and outperforms existing state-of-the-art approaches.

IEEE ACCESS (2022)

Article Computer Science, Information Systems

Deep Auto-Encoders With Sequential Learning for Multimodal Dimensional Emotion Recognition

Dung Nguyen, Duc Thanh Nguyen, Rui Zeng, Thanh Thi Nguyen, Son N. Tran, Thin Nguyen, Sridha Sridharan, Clinton Fookes

Summary: This paper proposes a novel deep neural network architecture for integrating visual and audio signal streams for emotion recognition, achieving state-of-the-art performance.

IEEE TRANSACTIONS ON MULTIMEDIA (2022)
