Article

Cross-modal domain adaptation for text-based regularization of image semantics in image retrieval systems

COMPUTER VISION AND IMAGE UNDERSTANDING (2014)

Journal

COMPUTER VISION AND IMAGE UNDERSTANDING

Volume 124, Issue -, Pages 123-135

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE

DOI: 10.1016/j.cviu.2014.03.003

Keywords

Content-based image retrieval; Query-by-example; Domain adaptation; Semantic representation; Cross-modal regularization; Class-specific regularization

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

FCT graduate Fellowship from the Portuguese Ministry of Sciences and Education [SFRH/BD/40963/2007]
NSF [CCF-0830535]
Direct For Computer & Info Scie & Enginr
Division of Computing and Communication Foundations [0830535] Funding Source: National Science Foundation
Fundação para a Ciência e a Tecnologia [SFRH/BD/40963/2007] Funding Source: FCT

Abstract

In query-by-semantic-example image retrieval, images are ranked by similarity of semantic descriptors. These descriptors are obtained by classifying each image with respect to a pre-defined vocabulary of semantic concepts. In this work, we consider the problem of improving the accuracy of semantic descriptors through cross-modal regularization, based on auxiliary text. A cross-modal regularizer, composed of three steps, is proposed. Training images and text are first mapped to a common semantic space. A regularization operator is then learned for each concept in the semantic vocabulary. This is an operator which maps the semantic descriptors of images labeled with that concept to the descriptors of the associated texts. A convex formulation of the learning problem is introduced, enabling the efficient computation of concept-specific regularization operators. The third step is the selection of the most suitable operator for the image to regularize. This is implemented through a quantization of the semantic space, where a regularization operator is associated with each quantization cell. Overall, the proposed regularizer is a non-linear mapping, implemented as a piecewise linear transformation of the semantic image descriptors to regularize. This transformation is a form of cross-modal domain adaptation. It is shown to achieve better performance than recent proposals in the domain adaptation literature, while requiring much simpler optimization. (C) 2014 Elsevier Inc. All rights reserved.
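The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation, which the abstract does not spell out. It assumes that descriptors are already mapped to a common semantic space (step one), that each concept-specific operator is a linear map fit by ridge regression (one possible convex learning problem), and that the semantic space is quantized by nearest class centroid. The names `fit_operators`, `regularize`, and `ridge` are hypothetical.

```python
import numpy as np


def fit_operators(img_desc, txt_desc, labels, n_concepts, ridge=1e-2):
    """Fit one linear operator per concept (assumed: ridge regression).

    img_desc, txt_desc: (n_samples, dim) semantic descriptors of paired
    images and texts, already in a common semantic space.
    labels: (n_samples,) integer concept label of each training pair.
    """
    dim = img_desc.shape[1]
    operators = np.zeros((n_concepts, dim, dim))
    centroids = np.zeros((n_concepts, dim))
    for c in range(n_concepts):
        X = img_desc[labels == c]  # image descriptors labeled with concept c
        Y = txt_desc[labels == c]  # descriptors of the associated texts
        # Closed-form minimizer of ||X W - Y||_F^2 + ridge * ||W||_F^2.
        operators[c] = np.linalg.solve(X.T @ X + ridge * np.eye(dim), X.T @ Y)
        # Assumed quantizer: one cell per concept, represented by its mean.
        centroids[c] = X.mean(axis=0)
    return operators, centroids


def regularize(desc, operators, centroids):
    """Apply the operator of the nearest quantization cell to each descriptor."""
    # Squared Euclidean distance from every descriptor to every centroid.
    dists = ((desc[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    cells = dists.argmin(axis=1)
    # Row-wise product desc[n] @ operators[cells[n]].
    out = np.einsum("nd,ndk->nk", desc, operators[cells])
    return out, cells
```

Because a different linear operator is applied in each cell, the overall mapping is piecewise linear, which matches the abstract's description of the regularizer as a non-linear mapping. If the descriptors are meant to be probability vectors, the output would also need to be projected back onto the simplex, which this sketch omits.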
