Article

Temporal Cross-Layer Correlation Mining for Action Recognition

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 24, Pages 668-676

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2021.3057503

Keywords

Convolution; Three-dimensional displays; Logic gates; Correlation; Trajectory; Aggregates; Training; Deep learning; video feature learning; video classification; action recognition; frame correlation mining


This paper proposes the Temporal Cross-Layer Correlation (TCLC) framework for action recognition. The framework explores temporal correlations among neighboring frames, assists cross-layer spatio-temporal feature learning, and integrates features with contextual knowledge through cross-layer attention and center-guided attention mechanisms.
Neighboring frames are more strongly correlated than frames that are temporally farther apart. In this paper, we aim to explore the temporal correlations among neighboring frames and exploit cross-layer multi-scale features for action recognition. First, we present a Temporal Cross-Layer Correlation (TCLC) framework for temporal correlation learning. The unified framework uncovers both local and global structures in video data, enabling better exploration of temporal context and assisting cross-layer spatio-temporal feature learning. Second, we propose a novel cross-layer attention mechanism and a center-guided attention mechanism to integrate features with contextual knowledge from multiple scales. Cross-layer feature learning proceeds in two stages. The first stage uses the cross-layer attention module to determine the importance weight of each convolutional layer. The second stage uses the center-guided attention mechanism to aggregate local features from each layer into a final video representation, leveraging global centers to extract semantic knowledge shared among videos. We evaluate TCLC on three action recognition datasets: UCF-101, HMDB-51 and Kinetics. Our experimental results demonstrate the superiority of our proposed temporal correlation mining method.
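
The two-stage process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the formulas, so the dot-product scoring, the softmax normalization, the per-center concatenation, and all names (`cross_layer_attention`, `center_guided_aggregate`, `query`, `centers`) are assumptions made for illustration. The sketch also assumes that features from every layer have already been projected to a common dimension.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cross_layer_attention(layer_feats, query):
    """Stage 1 (assumed form): one importance weight per layer.

    layer_feats: list of layers, each a list of local feature vectors.
    query: hypothetical learned vector that scores each layer's pooled feature.
    """
    pooled = [mean_vector(feats) for feats in layer_feats]
    return softmax([dot(p, query) for p in pooled])


def center_guided_aggregate(feats, centers):
    """Stage 2 (assumed form): attend from each global center to the local
    features of one layer, then concatenate the per-center weighted sums."""
    out = []
    for c in centers:
        w = softmax([dot(f, c) for f in feats])
        out.extend(sum(wi * f[d] for wi, f in zip(w, feats))
                   for d in range(len(c)))
    return out


def video_representation(layer_feats, query, centers):
    """Combine both stages: layer-weighted sum of center-guided aggregates."""
    weights = cross_layer_attention(layer_feats, query)
    rep = [0.0] * (len(centers) * len(centers[0]))
    for w, feats in zip(weights, layer_feats):
        agg = center_guided_aggregate(feats, centers)
        rep = [r + w * a for r, a in zip(rep, agg)]
    return weights, rep


if __name__ == "__main__":
    rng = random.Random(0)
    dim, num_centers = 4, 3
    # Three hypothetical layers with 5, 3 and 2 local features each.
    layers = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
              for n in (5, 3, 2)]
    query = [rng.gauss(0, 1) for _ in range(dim)]
    centers = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_centers)]
    weights, rep = video_representation(layers, query, centers)
    print("layer weights:", weights)
    print("representation length:", len(rep))
```

In this sketch the layer weights sum to one, and the final representation has length equal to the number of centers times the feature dimension. In a trained model the `query` and `centers` would be learned parameters rather than random vectors.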
