4.7 Article

Indoor Crowd Counting by Mixture of Gaussians Label Distribution Learning

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2019.2922818

Keywords

Label ambiguity; label distribution learning; mixture of Gaussians model

Funding

  1. National Key Research & Development Plan of China [2017YFB1002801]
  2. National Science Foundation of China [61622203]
  3. Collaborative Innovation Center of Novel Software Technology and Industrialization
  4. Collaborative Innovation Center of Wireless Communications Technology

In this paper, we tackle the problem of crowd counting in indoor videos, where people often stay almost static for a long time. A label distribution, which covers a range of crowd counting labels and represents the degree to which each label describes the video frame, has previously been adopted to model the label ambiguity of the crowd number. However, since the label ambiguity is significantly affected by the crowd number of the scene, we initialize the label distribution of each frame with a discretized Gaussian distribution with adaptive variance instead of the original single static Gaussian distribution. Moreover, considering the gradual change of crowd numbers across adjacent frames, a mixture of Gaussians model is proposed to generate the final label distribution representation for each frame. The weights of the Gaussian components depend on the frame and feature distances between the current frame and the adjacent frames. The mixed ℓ_{2,1}-norm is adopted to restrict the weights for predicting adjacent crowd numbers to be locally correlated. We collect three new indoor video datasets with frame number annotation for further research. The proposed approach achieves state-of-the-art performance on seven challenging indoor videos and in cross-scene experiments.
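
The label-distribution construction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: the linear rule in `adaptive_sigma`, the exponential weighting of frame and feature distances, and the parameters `base`, `scale`, `radius`, `alpha`, and `beta` are all assumptions chosen for illustration, since the abstract does not give the exact forms.

```python
import math


def discretized_gaussian(mean, sigma, max_count):
    """Gaussian evaluated at integer counts 0..max_count, normalized to sum to 1."""
    weights = [math.exp(-((k - mean) ** 2) / (2.0 * sigma ** 2))
               for k in range(max_count + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def adaptive_sigma(count, base=0.5, scale=0.1):
    """Assumed rule: label ambiguity (std. dev.) grows linearly with crowd count."""
    return base + scale * count


def mixture_label_distribution(counts, features, t, max_count,
                               radius=2, alpha=1.0, beta=1.0):
    """Mix the Gaussians of frames within `radius` of frame t.

    Assumed weighting: exp(-alpha * frame distance - beta * feature distance),
    normalized over the neighborhood so the mixture is a valid distribution.
    """
    lo, hi = max(0, t - radius), min(len(counts) - 1, t + radius)
    neighbors = range(lo, hi + 1)
    raw = [math.exp(-alpha * abs(j - t)
                    - beta * math.dist(features[j], features[t]))
           for j in neighbors]
    z = sum(raw)
    mixture = [0.0] * (max_count + 1)
    for w, j in zip(raw, neighbors):
        component = discretized_gaussian(counts[j], adaptive_sigma(counts[j]), max_count)
        for k, p in enumerate(component):
            mixture[k] += (w / z) * p
    return mixture


# Toy data: per-frame crowd counts and 1-D features (both made up).
counts = [10, 10, 11, 12, 12]
features = [[0.0], [0.1], [0.5], [0.9], [1.0]]
dist = mixture_label_distribution(counts, features, t=2, max_count=20)
```

Here `dist[k]` is the degree to which count `k` describes frame 2. Frames that are closer in time and in feature space contribute more, so the distribution stays centered near the annotated count while spreading some mass toward the neighboring frames' counts.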

Recommended

Article Engineering, Electrical & Electronic

MAMIQA: No-Reference Image Quality Assessment Based on Multiscale Attention Mechanism With Natural Scene Statistics

Li Yu, Junyang Li, Farhad Pakdaman, Miaogen Ling, Moncef Gabbouj

Summary: No-Reference Image Quality Assessment aims to evaluate image perceptual quality based on human perception. Many studies have used Transformers to simulate the human visual system by assigning different self-attention mechanisms to distinguish image regions. However, the quadratic computational complexity of self-attention is time-consuming and expensive. We propose a lightweight attention mechanism using decomposed large-kernel convolutions to extract multiscale features, and a novel feature enhancement module to simulate the human visual system. Additionally, we compensate for information loss caused by image resizing with supplementary features from natural scene statistics. Experimental results on five standard datasets demonstrate that our proposed method outperforms existing approaches while significantly reducing computational costs.

IEEE SIGNAL PROCESSING LETTERS (2023)
