4.6 Article

Stratified pooling based deep convolutional neural networks for human action recognition

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 76, Issue 11, Pages 13367-13382

Publisher

SPRINGER
DOI: 10.1007/s11042-016-3768-5

Keywords

Human action recognition; Convolutional neural networks (CNN); Stratified pooling (SP); Support vector machines (SVM)

Funding

  1. Natural Science Foundation of China [61202143, 61572409, 61571188]
  2. Natural Science Foundation of Fujian Province [2013J05100]
  3. Research Foundation of Education Bureau of Hunan Province [15C0726]


Video-based human action recognition is an active and challenging topic in computer vision. Over the last few years, deep convolutional neural networks (CNNs) have become the most popular approach and have achieved state-of-the-art performance on several datasets, such as HMDB-51 and UCF-101. Because each video yields a different number of frame-level features, combining them into a good video-level feature is a challenging task. This paper therefore proposes a novel action recognition method based on deep convolutional neural networks with stratified pooling (SP-CNN). The process consists of five main parts: (i) fine-tuning a pre-trained CNN on the target dataset; (ii) extracting frame-level features; (iii) reducing feature dimensionality with principal component analysis (PCA); (iv) stratified pooling of the frame-level features to obtain a video-level feature; and (v) multiclass classification with an SVM. Experimental results on the HMDB-51 and UCF-101 datasets show that the proposed method outperforms state-of-the-art methods.
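The abstract does not define how stratified pooling works, so the following is only an illustrative sketch of the general problem in step (iv): turning a variable number of frame-level features into a fixed-length video-level feature. It assumes, purely for illustration, that the strata are contiguous temporal segments and that each segment is mean-pooled. The function name `stratified_pool` and the parameter `num_strata` are hypothetical and are not taken from the paper.

```python
import numpy as np


def stratified_pool(frame_features, num_strata=3):
    """Pool a (num_frames, dim) array into one (num_strata * dim,) vector.

    Assumption (not from the paper): strata are contiguous temporal
    segments, and each segment is summarised by its mean feature.
    """
    frame_features = np.asarray(frame_features, dtype=float)
    if frame_features.ndim != 2 or len(frame_features) < num_strata:
        raise ValueError("need a 2-D array with at least num_strata frames")
    # array_split tolerates frame counts that are not divisible by num_strata.
    segments = np.array_split(frame_features, num_strata, axis=0)
    return np.concatenate([segment.mean(axis=0) for segment in segments])


# Two synthetic "videos" with different frame counts but the same
# per-frame feature size (64, standing in for PCA-reduced CNN features).
rng = np.random.default_rng(0)
short_video = rng.normal(size=(17, 64))
long_video = rng.normal(size=(120, 64))

short_vec = stratified_pool(short_video)
long_vec = stratified_pool(long_video)
```

Both videos map to vectors of the same length (`num_strata * dim`), which is the property a fixed-input classifier such as an SVM needs in step (v). The pooling rule actually used by SP-CNN may differ from this sketch.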

