期刊
IEEE TRANSACTIONS ON MULTIMEDIA
卷 16, 期 4, 页码 1059-1074出版社
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2014.2306655
关键词
Canonical correlation analysis; concept relationships; flickr; tagged images
资金
- Japan Society for the Promotion of Science (JSPS) [25 . 1688]
- Grants-in-Aid for Scientific Research [13J01688] Funding Source: KAKEN
This paper presents a cross-modal approach for extracting semantic relationships between concepts using tagged images. In the proposed method, we first project both text and visual features of the tagged images to a latent space using canonical correlation analysis (CCA). Then, under the probabilistic interpretation of CCA, we calculate a representative distribution of the latent variables for each concept. Based on the representative distributions of the concepts, we derive two types of measures: the semantic relatedness between the concepts and the abstraction level of each concept. Because these measures are derived from a cross-modal scheme that enables the collaborative use of both text and visual features, the semantic relationships can successfully reflect semantic and visual contexts. Experiments conducted on tagged images collected from Flickr show that our measures are more coherent to human cognition than the conventional measures that use either text or visual features, or the WordNet-based measures. In particular, a new measure of semantic relatedness, which satisfies the triangle inequality, obtains the best results among different distance measures in our framework. The applicability of our measures to multimedia-related tasks such as concept clustering, image annotation and tag recommendation is also shown in the experiments.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据