Article

Hybrid representation learning for cross-modal retrieval

Journal

NEUROCOMPUTING
Volume 345, Pages 45-57

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2018.10.082

Keywords

Cross-modal retrieval; Hybrid representation; DNNs

Funding

  1. National Natural Science Foundation of China [61771322, 61375015]
  2. Shenzhen Foundation fund [JCYJ20160307154630057]

Abstract

The rapid development of Deep Neural Networks (DNNs) in single-modal retrieval has promoted their wide application in cross-modal retrieval tasks. We therefore propose a DNN-based method to learn a shared representation for each modality. Our method, hybrid representation learning (HRL), consists of three steps. In the first learning step, stacked restricted Boltzmann machines (SRBMs) are used to extract a modality-friendly representation for each modality, whose statistical properties are more similar across modalities than those of the original input instances, and a multimodal deep belief net (multimodal DBN) is used to extract a modality-mutual representation, which recovers some information missing from the original input instances. In the second learning step, a two-level network consisting of a joint autoencoder and a three-layer feedforward neural net is used. These steps yield the hybrid representation, which combines the image representation constructed by the image-pathway SRBM with the modality-mutual representation; the latter involves the latent image representation and can be used to infer missing values of the image via the multimodal DBN, or vice versa. In the third learning step, stacked bimodal autoencoders are used to obtain the final shared representation for each modality. Experimental results on three widely used cross-modal datasets show that the proposed HRL method outperforms several state-of-the-art approaches. (C) 2019 Elsevier B.V. All rights reserved.
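To make the pipeline concrete, the sketch below shows the kind of bimodal autoencoder that the third step relies on: image and text features are encoded, fused into one shared code, and both modalities are reconstructed from that code. This is a minimal PyTorch illustration under stated assumptions, not the authors' implementation; all layer sizes, names, and the toy training step are hypothetical, and the paper's actual architecture and losses may differ.

```python
# Hypothetical sketch of a bimodal autoencoder in the spirit of HRL's
# third step. Dimensions and layer choices are illustrative only.
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    """Encode image and text features into one shared code and
    reconstruct both modalities from it."""
    def __init__(self, img_dim=1024, txt_dim=512, shared_dim=128):
        super().__init__()
        self.img_enc = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU())
        self.txt_enc = nn.Sequential(nn.Linear(txt_dim, 256), nn.ReLU())
        # Joint layer fuses the two modality codes into the shared representation.
        self.joint = nn.Linear(256 + 256, shared_dim)
        self.img_dec = nn.Linear(shared_dim, img_dim)
        self.txt_dec = nn.Linear(shared_dim, txt_dim)

    def forward(self, img, txt):
        h = torch.cat([self.img_enc(img), self.txt_enc(txt)], dim=1)
        shared = torch.relu(self.joint(h))
        return shared, self.img_dec(shared), self.txt_dec(shared)

# Toy training step: reconstruct both modalities from the shared code.
model = BimodalAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
img = torch.randn(32, 1024)   # stand-in for image-pathway SRBM features
txt = torch.randn(32, 512)    # stand-in for text-pathway SRBM features
opt.zero_grad()
shared, img_rec, txt_rec = model(img, txt)
loss = nn.functional.mse_loss(img_rec, img) + nn.functional.mse_loss(txt_rec, txt)
loss.backward()
opt.step()
```

At retrieval time, the `shared` code would serve as the common embedding in which queries from one modality are matched against items from the other, e.g. by cosine similarity.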
