Article

Spatiotemporal distilled dense-connectivity network for video action recognition

PATTERN RECOGNITION (2019)

Journal

PATTERN RECOGNITION

Volume 92, Pages 13-24

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.patcog.2019.03.005

Keywords

Two-stream; Action recognition; Dense-connectivity; Knowledge distillation

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

National Key R&D Program of China [2018YFB1004600]
Beijing Municipal Natural Science Foundation [Z181100008918010]
National Natural Science Foundation of China [61761146004, 61773375, 61836014]
Microsoft Collaborative Research Project

Abstract

Two-stream convolutional neural networks show great promise for action recognition. However, most two-stream approaches train the appearance and motion subnetworks independently, which may degrade performance because the two streams do not interact. To overcome this limitation, we propose a Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for video action recognition. The network combines knowledge distillation with dense connectivity (adapted from DenseNet) to explore interaction strategies between the appearance and motion streams at different levels of the hierarchy. Specifically, block-level dense connections between the appearance and motion pathways enable spatiotemporal interaction at the feature representation layers. In addition, knowledge distillation between the two streams (each treated as a student) and their final fusion (treated as the teacher) allows the streams to interact at the high-level layers. This architecture lets STDDCN gradually obtain effective hierarchical spatiotemporal features, and it can be trained end-to-end. Ablation studies on two benchmark datasets, UCF101 and HMDB51, validate the effectiveness and generalization of the model, which also achieves promising performance. (C) 2019 Elsevier Ltd. All rights reserved.
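
The distillation idea in the abstract, where each stream acts as a student and the fusion of both streams acts as the teacher, can be illustrated with a small sketch. The abstract does not give the loss or the fusion rule, so everything specific here is an assumption made for illustration: fusion is taken to be an element-wise average of the two streams' logits, the loss is the common temperature-softened KL divergence, and the names `fuse_logits` and `distillation_losses` are hypothetical. It is not the authors' implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def fuse_logits(appearance_logits, motion_logits):
    """ASSUMPTION: late fusion as the element-wise average of both streams' logits."""
    return [(a + m) / 2.0 for a, m in zip(appearance_logits, motion_logits)]

def distillation_losses(appearance_logits, motion_logits, temperature=2.0):
    """Return (appearance_loss, motion_loss).

    Each student stream is compared against the softened prediction of the
    fused teacher. ASSUMPTION: a standard softened-KL distillation term.
    """
    teacher = softmax(fuse_logits(appearance_logits, motion_logits), temperature)
    appearance = softmax(appearance_logits, temperature)
    motion = softmax(motion_logits, temperature)
    return kl_divergence(teacher, appearance), kl_divergence(teacher, motion)

if __name__ == "__main__":
    # Made-up logits for a 3-class toy example.
    appearance = [2.0, 0.5, -1.0]
    motion = [0.2, 1.8, -0.5]
    loss_a, loss_m = distillation_losses(appearance, motion)
    print(f"appearance KD loss: {loss_a:.4f}, motion KD loss: {loss_m:.4f}")
```

When the two streams already agree, the fused teacher matches both students and both terms vanish; the more the streams disagree, the larger the pull toward the fused prediction. In an actual training setup a term like this would typically be added to each stream's ordinary classification loss.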
