Journal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume 30, Issue 10, Pages 3486-3498
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2019.2919139
Keywords
Estimation; Feature extraction; Image segmentation; Training; Task analysis; Head; Semantics; Crowd counting; crowd analysis; spatial convolutional network; background segmentation; multi-task learning
Funding
- National Natural Science Foundation of China [U1864204, 61773316, 61871470]
- Natural Science Foundation of Shaanxi Province [2018KJXX-024]
- Project of Special Zone for National Defense Science and Technology Innovation
- Key Research Program of Frontier Sciences, CAS [QYZDY-SSW-JSC044]
Crowd counting from a single image is a challenging task due to high appearance similarity, perspective changes, and severe congestion. Many methods focus only on local appearance features and cannot handle these challenges. To tackle them, we propose a perspective crowd counting network (PCC Net), which consists of three parts: 1) density map estimation (DME), which focuses on learning very local features for density map estimation; 2) random high-level density classification (R-HDC), which extracts global features to predict the coarse density labels of random patches in images; and 3) fore-/background segmentation (FBS), which encodes mid-level features to segment the foreground and background. In addition, the Down, Up, Left, and Right (DULR) module is embedded in PCC Net to encode perspective changes in four directions. The proposed PCC Net is verified on five mainstream datasets, achieving state-of-the-art performance on one and competitive results on the other four. The source code is available at https://github.com/gjy3035/PCC-Net.
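To make the idea of encoding context in four directions more concrete, the sketch below sweeps a small 2D feature map down, up, left, and right in turn, so that each cell accumulates information from its neighbours along every direction. It is an illustrative simplification only, not the module from the paper or its repository. The function names `directional_pass` and `dulr`, the scalar `weight` (standing in for a learned convolution), and the sweep order are all assumptions made for this example.

```python
def relu(x):
    """Rectified linear unit for a single number."""
    return x if x > 0 else 0.0


def directional_pass(grid, weight, direction):
    """Propagate values across a 2D list of floats in one direction.

    Hypothetical simplification: each row (or column) adds
    weight * relu(previously updated neighbour) instead of applying a
    learned convolution, as a real network layer would.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so the input is not mutated
    if direction == "down":      # information flows top to bottom
        for r in range(1, rows):
            for c in range(cols):
                out[r][c] += weight * relu(out[r - 1][c])
    elif direction == "up":      # bottom to top
        for r in range(rows - 2, -1, -1):
            for c in range(cols):
                out[r][c] += weight * relu(out[r + 1][c])
    elif direction == "left":    # right to left
        for c in range(cols - 2, -1, -1):
            for r in range(rows):
                out[r][c] += weight * relu(out[r][c + 1])
    elif direction == "right":   # left to right
        for c in range(1, cols):
            for r in range(rows):
                out[r][c] += weight * relu(out[r][c - 1])
    else:
        raise ValueError("unknown direction: " + direction)
    return out


def dulr(grid, weight=0.5):
    """Apply the four sweeps in sequence (order is an assumption)."""
    out = grid
    for direction in ("down", "up", "left", "right"):
        out = directional_pass(out, weight, direction)
    return out


if __name__ == "__main__":
    feature_map = [[1.0, 0.0], [0.0, 0.0]]
    print(dulr(feature_map, weight=0.5))
```

After the four sweeps, the single active cell in the top-left corner has influenced every other cell, which is the kind of long-range, direction-aware context the abstract attributes to the DULR module.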