Article

Discriminative Dictionary Learning With Common Label Alignment for Cross-Modal Retrieval

IEEE TRANSACTIONS ON MULTIMEDIA (2016)

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Volume 18, Issue 2, Pages 208-218

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TMM.2015.2508146

Keywords

Common space; cross-modal retrieval; discriminative dictionary learning; label alignment

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications

Funding

National Natural Science Foundation of China [61572388, 61125204, 61432014, 61303220]
National High Technology Research and Development Program of China [2013AA01A602]
Program for New Century Excellent Talents in University [NCET-12-0917]
Fundamental Research Funds for the Central Universities [K5051302019]
Doctoral Program of Higher Education of China [20120203120014]

Abstract

Cross-modal retrieval has attracted much attention in recent years due to its widespread applications. A central challenge in this area is how to capture and correlate heterogeneous features originating from different modalities. Most existing cross-modal learning methods focus only on learning relevant features shared by two distinct feature spaces and therefore overlook the discriminative information those features carry. To remedy this issue and explicitly capture discriminative feature information, we propose a novel cross-modal retrieval approach based on discriminative dictionary learning that is augmented with common label alignment. Concretely, a discriminative dictionary is first learned for each modality, which boosts not only the discriminating capability of intra-modality data from different classes but also the relevance of inter-modality data in the same class. Subsequently, all the resulting sparse codes are simultaneously mapped to a common label space, where the cross-modal data samples are characterized and associated. In the label space, the discriminativeness and relevance of the cross-modal data are further strengthened by enforcing a common label alignment. Finally, cross-modal retrieval is performed over the common label space. Experiments conducted on two public cross-modal datasets show that the proposed approach outperforms several state-of-the-art methods in terms of retrieval accuracy.
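The pipeline described in the abstract (one dictionary per modality, sparse codes mapped to a shared label space, retrieval in that space) can be illustrated with a small NumPy sketch. This is not the paper's joint objective: it swaps in a plain, unsupervised dictionary learner (ISTA sparse coding with a least-squares dictionary update) and a separate ridge regression onto one-hot labels, so the discriminative and alignment terms of the paper are only approximated by the shared label targets. All function names, the toy data, and the hyperparameters (`k`, `lam`, `reg`) are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=100):
    """ISTA for min_A 0.5*||X - D A||_F^2 + lam*||A||_1, X: (dim, n), D: (dim, k)."""
    L = np.linalg.norm(D, 2) ** 2 + 1e-12  # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = soft_threshold(A - D.T @ (D @ A - X) / L, lam / L)
    return A

def learn_dictionary(X, k, lam=0.1, n_outer=10, seed=0):
    """Alternate sparse coding with a regularized least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], k))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_outer):
        A = sparse_codes(X, D, lam)
        G = A @ A.T + 1e-6 * np.eye(k)           # small ridge keeps G invertible
        D = np.linalg.solve(G, A @ X.T).T         # D = X A^T G^{-1}
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D

def fit_label_map(A, Y, reg=1e-2):
    """Ridge regression W such that W @ A approximates one-hot labels Y: (c, n)."""
    G = A @ A.T + reg * np.eye(A.shape[0])
    return np.linalg.solve(G, A @ Y.T).T

def to_label_space(X, D, W, lam=0.1):
    """Encode samples with the modality dictionary, then project to label space."""
    return W @ sparse_codes(X, D, lam)

def rank_gallery(query_lab, gallery_lab):
    """For each query column, gallery indices sorted by descending cosine similarity."""
    q = query_lab / (np.linalg.norm(query_lab, axis=0, keepdims=True) + 1e-12)
    g = gallery_lab / (np.linalg.norm(gallery_lab, axis=0, keepdims=True) + 1e-12)
    return np.argsort(-(q.T @ g), axis=1)

def toy_pair(n_per_class=20, n_classes=3, dims=(20, 10), noise=0.05, seed=0):
    """Synthetic paired 'image' and 'text' features that share class labels."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_classes), n_per_class)
    Y = np.eye(n_classes)[:, labels]              # one-hot labels, shape (c, n)
    views = []
    for d in dims:
        means = 3.0 * rng.standard_normal((d, n_classes))
        views.append(means[:, labels] + noise * rng.standard_normal((d, labels.size)))
    return views[0], views[1], Y, labels

X_img, X_txt, Y, labels = toy_pair()
D_img = learn_dictionary(X_img, k=8)
D_txt = learn_dictionary(X_txt, k=8, seed=1)
W_img = fit_label_map(sparse_codes(X_img, D_img), Y)
W_txt = fit_label_map(sparse_codes(X_txt, D_txt), Y)

# Image-to-text retrieval: both modalities are compared in the common label space.
ranks = rank_gallery(to_label_space(X_img, D_img, W_img),
                     to_label_space(X_txt, D_txt, W_txt))
top1 = float(np.mean(labels[ranks[:, 0]] == labels))
print(f"image-to-text top-1 class accuracy on toy data: {top1:.2f}")
```

Because both label maps regress onto the same one-hot targets, samples of the same class from either modality land near the same vertex of the label space, which is what makes the cosine ranking meaningful across modalities. The paper instead learns the dictionaries and label mappings together, so this sketch should be read only as an outline of the data flow.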

Semantic-enhanced discriminative embedding learning for cross-modal retrieval

Hao Pan, Jun Huang

Summary: This paper proposes a novel semantic-enhanced discriminative embedding learning method to improve the discriminative ability of cross-modal retrieval models. The method consists of three modules: attention-guided erasing, large-scale negative sampling, and weighted InfoNCE loss. Experimental results demonstrate the effectiveness of integrating these modules into existing models.

INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL (2022)