Journal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume 31, Issue 7, Pages 2751-2763
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2020.3032650
Keywords
Visualization; Correlation; Proposals; Semantics; Task analysis; Explosions; Object detection; Visual relationship detection; graph neural network; label distribution
Funding
- National Key Research and Development Program [2017YFB1002401]
- NSFC [61971281]
- Science and Technology Commission of Shanghai Municipality (STCSM) [18DZ2270700, 18DZ1112300]
Visual relationship detection is a challenging task that has gained much attention recently. The proposed unified framework addresses the combination explosion problem in the object-pair proposal stage and the non-exclusive label problem in the predicate recognition stage. Experimental results show that the method outperforms current state-of-the-art methods on the widely used VRD and VG datasets.
Visual relationship detection, a challenging task that aims to find and distinguish interactions between object pairs in an image, has received much attention recently. In this work, we devise a unified visual relationship detection framework that exploits two types of correlation to address the combination explosion problem in the object-pair proposal stage and the non-exclusive label problem in the predicate recognition stage. In the object-pair proposal stage, a location-embedded rating module (LRM) exploits the relative location correlation between the two objects in a pair to select plausible proposals effectively. In the predicate recognition stage, a label-correlation graph module (LGM) measures the implicit semantic correlation among predicates and then assigns discrete distributed labels to predicates to improve the precision of top-n recall. Experiments on the two widely used VRD and VG datasets show that our proposed method outperforms current state-of-the-art methods.
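The two ideas in the abstract, relative location between the objects of a pair and distributed (non one-hot) predicate labels, can be illustrated with a minimal sketch. This is not the paper's LRM or LGM: the feature layout in `relative_location`, the softmax in `soft_labels`, the `temperature` parameter, and the toy similarity values are all assumptions chosen for illustration only.

```python
import math

def relative_location(subj, obj):
    """Encode where `obj` sits relative to `subj`.

    Boxes are (x1, y1, x2, y2). The four-value layout (normalized center
    offsets plus log size ratios) is a hypothetical choice, not the
    encoding used in the paper.
    """
    sx1, sy1, sx2, sy2 = subj
    ox1, oy1, ox2, oy2 = obj
    sw, sh = sx2 - sx1, sy2 - sy1
    ow, oh = ox2 - ox1, oy2 - oy1
    # Offset between box centers, scaled by the subject box size.
    dx = ((ox1 + ox2) / 2 - (sx1 + sx2) / 2) / sw
    dy = ((oy1 + oy2) / 2 - (sy1 + sy2) / 2) / sh
    return [dx, dy, math.log(ow / sw), math.log(oh / sh)]

def soft_labels(true_idx, similarity, temperature=1.0):
    """Turn one ground-truth predicate into a label distribution.

    `similarity` is an assumed predicate-by-predicate score matrix; the
    row of the true predicate is passed through a softmax so that
    semantically close predicates also receive some probability mass.
    """
    row = similarity[true_idx]
    exps = [math.exp(s / temperature) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: three predicates with made-up similarity scores.
predicates = ["on", "above", "under"]
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
features = relative_location((0, 0, 10, 10), (5, 5, 15, 15))
distribution = soft_labels(predicates.index("on"), similarity)
```

With these toy scores, the label "on" keeps the largest share of the distribution while "above" receives more mass than "under", which is the intuition behind non-exclusive predicate labels.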