Article

Learning to rank with relational graph and pointwise constraint for cross-modal retrieval

SOFT COMPUTING (2019)

Journal

SOFT COMPUTING

Volume 23, Issue 19, Pages 9413-9427

Publisher

SPRINGER

DOI: 10.1007/s00500-018-3608-9

Keywords

Cross-modal retrieval; Ranking model; Single modality; Pointwise constraint; Interpolation algorithm

Categories

Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications

Funding

13th Five-Year Plan for the Development of Philosophy and Social Sciences in Guangzhou [2018GZYB36]
Science Foundation of Guangdong Provincial Communications Department [2015-02-064]
National Natural Science Foundation of China [61402185]
South China Normal University-Bluedon Information Security Technologies Co., Ltd [LD20170201]

Abstract

Cross-modal retrieval (i.e., image-query-text or text-query-image) is a hot research topic in multimedia information retrieval, but the heterogeneity gap between modalities poses a critical challenge for multimodal data. Some researchers treat cross-modal retrieval as a learning-to-rank task and usually measure the similarity between two modalities in a shared embedding subspace. However, most previous methods focus on constructing a discriminative objective function to optimize the common space and neglect the correlations within a single modality. In this paper, we treat cross-modal retrieval, from the perspective of optimizing the ranking model, as a listwise ranking problem and propose a novel method called learning to rank with relational graph and pointwise constraint (LR²GP). In LR²GP, we first propose a discriminative ranking model that uses the relations within a single modality to improve ranking performance and thereby learn an optimal common embedding subspace. Then, a pointwise constraint is introduced in the low-dimensional embedding subspace to compensate for the real loss in the training phase, since the listwise method only optimizes the latent permutation directly from a global perspective. Finally, a dynamic interpolation algorithm, which gradually transitions from pointwise and pairwise to listwise learning, is used to fuse the loss functions in a reasonable way. Experiments on the Wikipedia and Pascal benchmark datasets demonstrate the effectiveness of the proposed method.
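
The abstract does not give the form of the interpolation, so the following Python sketch is only an illustration of the general idea of moving from pointwise and pairwise losses toward a listwise loss during training. Everything in it is an assumption rather than the authors' formulation: the squared-error pointwise loss, the hinge pairwise loss, the ListNet-style top-one listwise loss, the linear schedule, and all function names are hypothetical choices made for this example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pointwise_loss(scores, labels):
    """Mean squared error between each score and its relevance label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores, labels, margin=1.0):
    """Mean hinge loss over pairs where item i is more relevant than item j."""
    losses = [
        max(0.0, margin - (scores[i] - scores[j]))
        for i in range(len(scores))
        for j in range(len(scores))
        if labels[i] > labels[j]
    ]
    return sum(losses) / len(losses) if losses else 0.0


def listwise_loss(scores, labels):
    """ListNet-style top-one cross-entropy between label and score distributions."""
    target = softmax(labels)
    predicted = softmax(scores)
    # Clamp probabilities to avoid log(0) when scores are far apart.
    return -sum(t * math.log(max(p, 1e-12)) for t, p in zip(target, predicted))


def interpolation_weight(step, total_steps):
    """Assumed linear schedule: 0.0 at the first step, 1.0 at the last."""
    if total_steps <= 1:
        return 1.0
    return min(1.0, max(0.0, step / (total_steps - 1)))


def interpolated_loss(scores, labels, step, total_steps):
    """Blend local (pointwise + pairwise) and listwise losses by training progress."""
    alpha = interpolation_weight(step, total_steps)
    local = pointwise_loss(scores, labels) + pairwise_loss(scores, labels)
    return (1.0 - alpha) * local + alpha * listwise_loss(scores, labels)
```

Under this assumed schedule, early steps are dominated by the per-item and per-pair terms, and the final step uses the listwise term alone. A non-linear schedule or separate weights for the pointwise and pairwise terms would fit the same description equally well.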

Unsupervised Cross-Modal Hashing With Modality-Interaction

Rong-Cheng Tu, Jie Jiang, Qinghong Lin, Chengfei Cai, Shangxuan Tian, Hongfa Wang, Wei Liu

Summary: In this paper, the authors propose a novel unsupervised cross-modal hashing method (UCHM) that utilizes a modality-interaction-enabled similarity generator and a bit-selection module to improve the retrieval performance of unlabeled cross-modal data.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2023)