Article

Hybrid representation learning for cross-modal retrieval

Journal

NEUROCOMPUTING
Volume 345, Pages 45-57

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2018.10.082

Keywords

Cross-modal retrieval; Hybrid representation; DNNs

Funding

  1. National Natural Science Foundation of China [61771322, 61375015]
  2. Shenzhen Foundation fund [JCYJ20160307154630057]

Abstract

The rapid development of Deep Neural Networks (DNNs) in single-modal retrieval has promoted their wide application in cross-modal retrieval tasks. We therefore propose a DNN-based method to learn a shared representation for each modality. Our method, hybrid representation learning (HRL), consists of three steps. In the first learning step, stacked restricted Boltzmann machines (SRBMs) are used to extract a modality-friendly representation for each modality, whose statistical properties are more similar across modalities than those of the original input instances, and a multimodal deep belief net (multimodal DBN) is used to extract a modality-mutual representation, which recovers some information missing from the original input instances. In the second learning step, a two-level network consisting of a joint autoencoder and a three-layer feedforward neural net is used. These steps yield the hybrid representation, which combines the image representation constructed by the image-pathway SRBM with the modality-mutual representation; the latter involves the latent image representation and can be used to infer missing values of the image via the multimodal DBN, or vice versa. In the third learning step, stacked bimodal autoencoders are used to obtain the final shared representation for each modality. Experimental results on three widely used cross-modal datasets show that the proposed HRL method outperforms several state-of-the-art approaches. (C) 2019 Elsevier B.V. All rights reserved.
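To make the pipeline concrete, the sketch below shows the kind of bimodal autoencoder that the third step relies on: image and text features are encoded, fused into one shared code, and both modalities are reconstructed from that code. This is a minimal PyTorch illustration under stated assumptions, not the authors' implementation; all layer sizes, names, and the toy training step are hypothetical, and the paper's actual architecture and losses may differ.

```python
# Hypothetical sketch of a bimodal autoencoder in the spirit of HRL's
# third step. Dimensions and layer choices are illustrative only.
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    """Encode image and text features into one shared code and
    reconstruct both modalities from it."""
    def __init__(self, img_dim=1024, txt_dim=512, shared_dim=128):
        super().__init__()
        self.img_enc = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU())
        self.txt_enc = nn.Sequential(nn.Linear(txt_dim, 256), nn.ReLU())
        # Joint layer fuses the two modality codes into the shared representation.
        self.joint = nn.Linear(256 + 256, shared_dim)
        self.img_dec = nn.Linear(shared_dim, img_dim)
        self.txt_dec = nn.Linear(shared_dim, txt_dim)

    def forward(self, img, txt):
        h = torch.cat([self.img_enc(img), self.txt_enc(txt)], dim=1)
        shared = torch.relu(self.joint(h))
        return shared, self.img_dec(shared), self.txt_dec(shared)

# Toy training step: reconstruct both modalities from the shared code.
model = BimodalAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
img = torch.randn(32, 1024)   # stand-in for image-pathway SRBM features
txt = torch.randn(32, 512)    # stand-in for text-pathway SRBM features
opt.zero_grad()
shared, img_rec, txt_rec = model(img, txt)
loss = nn.functional.mse_loss(img_rec, img) + nn.functional.mse_loss(txt_rec, txt)
loss.backward()
opt.step()
```

At retrieval time, the `shared` code would serve as the common embedding in which queries from one modality are matched against items from the other, e.g. by cosine similarity.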
