Article

Leveraging multimodal information for event summarization and concept-level sentiment analysis

KNOWLEDGE-BASED SYSTEMS (2016)

Volume 108, Pages 102-109

Publisher

ELSEVIER SCIENCE BV

DOI: 10.1016/j.knosys.2016.05.022

Keywords

Multimedia summarization; Semantics analysis; Sentics analysis; Multimodal analysis; Multimedia-related services

Funding

Singapore's Ministry of Education (MOE) [T1 251RES1415]
JSPS [16K16058]
Grants-in-Aid for Scientific Research [16K16058] Funding Source: KAKEN

Abstract

The rapid growth in the amount of user-generated content (UGC) online requires social media companies to automatically extract knowledge structures (concepts) from photos and videos in order to provide diverse multimedia-related services. However, real-world photos and videos are complex and noisy, and extracting semantics and sentics from the multimedia content alone is very difficult because suitable concepts may be exhibited in different representations. Hence, it is desirable to analyze UGC from multiple modalities for a better understanding. To this end, we first present the EventBuilder system, which deals with semantics understanding and automatically generates a multimedia summary for a given event in real time by leveraging different social media such as Wikipedia and Flickr. Subsequently, we present the EventSensor system, which aims to address sentics understanding and produces a multimedia summary for a given mood. It extracts concepts and mood tags from the visual content and textual metadata of UGC, and exploits them to support several significant multimedia-related services such as a musical multimedia summary. Moreover, EventSensor supports sentics-based event summarization by leveraging EventBuilder as its semantics engine component. Experimental results confirm that both EventBuilder and EventSensor outperform their baselines and efficiently summarize knowledge structures on the YFCC100M dataset. (C) 2016 Elsevier B.V. All rights reserved.

Topic-aware video summarization using multimodal transformer

Yubo Zhu, Wentian Zhao, Rui Hua, Xinxiao Wu

Summary: Video summarization is the task of generating a concise and compact summary to represent the original video. Existing methods focus on extracting objective summaries that accurately summarize the video content. However, videos often contain diverse content with multiple topics, and people may have different interests in the visual contents of the same video. In this paper, we propose a novel topic-aware video summarization task that generates multiple video summaries with different topics. We build a benchmark dataset and propose a multimodal Transformer model to address this task, achieving effective results.

PATTERN RECOGNITION (2023)