Article

Semantically-enhanced kernel canonical correlation analysis: a multi-label cross-modal retrieval

MULTIMEDIA TOOLS AND APPLICATIONS (2019)

Journal

MULTIMEDIA TOOLS AND APPLICATIONS

Volume 78, Issue 10, Pages 13169-13188

Publisher

SPRINGER

DOI: 10.1007/s11042-018-5767-1

Keywords

Cross-modal retrieval; Kernel CCA; Multi-label information; Concept correlations

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Funding

Natural Science Foundation of China [61571453, 61502264, 61405252]
Natural Science Foundation of Hunan Province, China [14JJ3010]
National University of Defense Technology [ZK16-03-37]

Abstract

Cross-modal retrieval aims to measure inter-media semantic similarity by aligning heterogeneous features in an intermediate common subspace in which they can be reasonably compared. This relies on a shared understanding of the semantics represented by the different modalities. However, semantics are usually reflected by multiple concepts, since concepts co-occur in the real world rather than occurring in isolation. This leads to the more challenging task of multi-label cross-modal retrieval, in which, for example, multiple concepts are annotated as labels for an image. More importantly, the co-occurrence patterns of concepts produce correlated pairs of labels, whose relationships need to be considered for accurate cross-modal retrieval. In this paper, we propose multi-label kernel canonical correlation analysis (ml-KCCA), a novel approach to cross-modal retrieval that enhances kernel CCA with the high-level semantic information reflected in multi-label annotations. By kernelizing correlation extraction from multi-label information, more complex non-linear correlations between modalities can be measured in order to learn a discriminative subspace better suited to cross-modal retrieval tasks. Extensive evaluations on public datasets validate the improvements of our approach over state-of-the-art cross-modal retrieval approaches, including other CCA extensions.
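
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of plain regularized kernel CCA, not the ml-KCCA method itself, whose multi-label formulation is not given on this page. It assumes paired samples from two modalities, an RBF kernel, and a ridge term `kappa`; the function names, the toy data, and all parameter values are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X * X, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def center_kernel(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def kernel_cca(Kx, Ky, n_components=2, kappa=0.1):
    """Regularized kernel CCA on two centered kernel matrices.

    Maximizes a^T Kx Ky b subject to a^T (Kx + kappa I)^2 a = 1 and
    b^T (Ky + kappa I)^2 b = 1. Substituting u = (Kx + kappa I) a and
    v = (Ky + kappa I) b turns this into an SVD problem.
    """
    n = Kx.shape[0]
    Rx = Kx + kappa * np.eye(n)
    Ry = Ky + kappa * np.eye(n)
    # M = Rx^{-1} Kx Ky Ry^{-1}; Ky and Ry commute, so Ky Ry^{-1} = Ry^{-1} Ky.
    M = np.linalg.solve(Rx, Kx) @ np.linalg.solve(Ry, Ky)
    U, s, Vt = np.linalg.svd(M)
    # Map the singular vectors back to dual coefficients.
    alpha = np.linalg.solve(Rx, U[:, :n_components])
    beta = np.linalg.solve(Ry, Vt.T[:, :n_components])
    return alpha, beta, s[:n_components]


# Toy paired data: two "modalities" generated from a shared latent variable z.
rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=(80, 1))
images = np.hstack([np.sin(3 * z), z ** 2]) + 0.05 * rng.normal(size=(80, 2))
texts = np.hstack([z ** 3, np.cos(2 * z)]) + 0.05 * rng.normal(size=(80, 2))

Kx = center_kernel(rbf_kernel(images, gamma=1.0))
Ky = center_kernel(rbf_kernel(texts, gamma=1.0))
alpha, beta, rho = kernel_cca(Kx, Ky, n_components=2, kappa=0.1)

# Training samples of both modalities embedded in the common subspace.
image_embedding = Kx @ alpha
text_embedding = Ky @ beta
```

In a retrieval setting, a query from one modality would be embedded in the same way and compared with items of the other modality, for instance by cosine similarity. How ml-KCCA injects label correlations into this objective is described in the paper, not here.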

Deep adversarial multi-label cross-modal hashing algorithm

Xiaohan Yang, Zhen Wang, Wenhao Liu, Xinyi Chang, Nannan Wu

Summary: In recent years, researchers have been using hashing algorithms to improve the efficiency of large-scale cross-modal retrieval by mapping features into binary codes. However, existing cross-modal hashing algorithms often overlook the multi-label information by focusing only on single class labels. To address this issue, we propose DAMCH, a deep adversarial multi-label cross-modal hashing algorithm that considers both multi-label and deep features. Our algorithm preserves the Hamming neighbor relationship and ensures the same semantic information in binary features as in the original label. Additionally, our algorithm minimizes information loss during feature mapping and ensures consistent feature distribution across modalities. Experimental results show that DAMCH outperforms state-of-the-art methods.

INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL (2023)