Article

Leveraging multimodal information for event summarization and concept-level sentiment analysis

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 108, Pages 102-109

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.knosys.2016.05.022

Keywords

Multimedia summarization; Semantics analysis; Sentics analysis; Multimodal analysis; Multimedia-related services

Funding

  1. Singapore's Ministry of Education (MOE) [T1 251RES1415]
  2. JSPS [16K16058]
  3. Grants-in-Aid for Scientific Research [16K16058] Funding Source: KAKEN

Abstract

The rapid growth in the amount of user-generated content (UGCs) online requires social media companies to automatically extract knowledge structures (concepts) from photos and videos in order to provide diverse multimedia-related services. However, real-world photos and videos are complex and noisy, and extracting semantics and sentics from the multimedia content alone is a very difficult task because suitable concepts may be exhibited in different representations. Hence, it is desirable to analyze UGCs from multiple modalities for a better understanding. To this end, we first present the EventBuilder system, which deals with semantics understanding and automatically generates a multimedia summary for a given event in real time by leveraging different social media such as Wikipedia and Flickr. Subsequently, we present the EventSensor system, which aims to address sentics understanding and produces a multimedia summary for a given mood. It extracts concepts and mood tags from the visual content and textual metadata of UGCs, and exploits them in supporting several significant multimedia-related services such as a musical multimedia summary. Moreover, EventSensor supports sentics-based event summarization by leveraging EventBuilder as its semantics engine component. Experimental results confirm that both EventBuilder and EventSensor outperform their baselines and efficiently summarize knowledge structures on the YFCC100M dataset. © 2016 Elsevier B.V. All rights reserved.

Recommended

Article Computer Science, Information Systems

HxL3: Optimized Delivery Architecture for HTTP Low-Latency Live Streaming

Farzad Tashtarian, Abdelhak Bentaleb, Alireza Erfanian, Hermann Hellwagner, Christian Timmerer, Roger Zimmermann

Summary: This paper proposes a novel architecture, HxL3, for low-latency live streaming. By implementing efficient caching and prefetching policies, HxL3 minimizes the number of live media segments, reducing rebuffering and startup delay, and achieving high-quality live streaming experiences.

IEEE TRANSACTIONS ON MULTIMEDIA (2023)

Article Computer Science, Artificial Intelligence

Evaluation of a new dataset for visual detection of cervical precancerous lesions

Ying Zhang, Yonit Zall, Ronen Nissim, Satyam, Roger Zimmermann

Summary: Automated visual evaluation (AVE) is a promising method for detecting and diagnosing cervical precancerous lesions through deep learning classifier analysis of images. The introduction of a new dataset (EVA dataset) collected using a mobile colposcope shows potential challenges for high-grade SIL diagnosis and AVE classifier development. The results suggest that a deep learning framework is effective for high-grade SIL diagnosis but improvements are needed, especially for the EVA dataset.

EXPERT SYSTEMS WITH APPLICATIONS (2022)

Article Automation & Control Systems

Towards Floor Identification and Pinpointing Position: A Multistory Localization Model with WiFi Fingerprint

Xing Zhang, Wei Sun, Jin Zheng, Min Xue, Chenjun Tang, Roger Zimmermann

Summary: This study focuses on indoor WiFi fingerprint localization and proposes a floor identification module and a fingerprint graph attention mechanism. By comprehensively analyzing fingerprint attributes and using a two-panel fingerprint homogeneity graph, the experimental results show that the proposed method achieves better performance in floor identification and 2-D geometric positioning.

INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS (2022)

Article Computer Science, Information Systems

Variational Autoencoder with CCA for Audio-Visual Cross-modal Retrieval

Jiwei Zhang, Yi Yu, Suhua Tang, Jianming Wu, Wei Li

Summary: Cross-modal retrieval is a popular topic in information retrieval, machine learning, and databases. The major challenge is to measure the similarity between different modality data effectively. Current methods struggle to extract features from multi-modal information. In this article, we propose a novel variational autoencoder architecture that improves the performance of audio-visual cross-modal retrieval.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

Melody Generation from Lyrics with Local Interpretability

Wei Duan, Yi Yu, Xulong Zhang, Suhua Tang, Wei Li, Keizo Oyama

Summary: This article proposes a model for melody generation from lyrics with local interpretability, which enhances the understanding of the relationship between input lyrics and generated melodies.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Editorial Material Computer Science, Information Systems

Special issue on human-centric intelligent multimedia understanding

Zhenguang Liu, Roger Zimmermann, Li Cheng

MULTIMEDIA SYSTEMS (2023)

Article Chemistry, Analytical

Multi-Task Partial Offloading with Relay and Adaptive Bandwidth Allocation for the MEC-Assisted IoT

Hafiz Hasnain Imtiaz, Suhua Tang

Summary: The fifth-generation (5G) wireless network enables low latency services in Internet of Things (IoT) networks. However, IoT nodes lack computational capabilities for real-time complex tasks. To address this issue, multi-access edge computing (MEC) allows IoT nodes to offload their computational tasks to MEC servers. This paper proposes a method that combines relay selection and adaptive bandwidth allocation to improve the efficiency of multi-task partial offloading in IoT networks. Simulation results show that the proposed method outperforms other methods without these functions or with only one of them.

SENSORS (2023)

Article Transportation

A Random Effect Bayesian Neural Network (RE-BNN) for travel mode choice analysis across multiple regions

Yutong Xia, Huanfa Chen, Roger Zimmermann

Summary: This study proposes a framework of Random Effect-Bayesian Neural Network (RE-BNN) for predicting and explaining travel mode choice across multiple regions. The results show that this model outperforms the plain Deep Neural Network (DNN) in terms of prediction accuracy and is more robust across different datasets. Additionally, the capability of the RE-BNN model to learn travel behaviors across regions is demonstrated through offset utilities, choice probability functions, and local travel mode shares.

TRAVEL BEHAVIOUR AND SOCIETY (2023)

Article Chemistry, Analytical

Relay Selection for Over-the-Air Computation Achieving Both Long Lifetime and High Reliability

Jingyang Zhou, Suhua Tang

Summary: In a wireless sensor network, conventional methods for data collection and computation have scalability issues and transmission collisions. Using over-the-air computation (AirComp) can efficiently perform data collection and computation, but it has problems with low channel gain and computation errors. To solve these problems, this paper investigates relay communication for AirComp and proposes a relay selection protocol. The proposed method helps to prolong network lifetime and reduce computation errors.

SENSORS (2023)

Article Computer Science, Artificial Intelligence

Mixed-Order Relation-Aware Recurrent Neural Networks for Spatio-Temporal Forecasting

Yuxuan Liang, Kun Ouyang, Yiwei Wang, Zheyi Pan, Yifang Yin, Hongyang Chen, Junbo Zhang, Yu Zheng, David S. Rosenblum, Roger Zimmermann

Summary: Spatio-temporal forecasting has various applications in smart cities, but the state-of-the-art method, GCRNN, fails to consider higher-order spatial relations and underlying physics in real-world systems. Therefore, we propose MixRNN+, a general model that captures complex spatial relations and addresses underlying physics, for spatio-temporal forecasting. Experimental results on three forecasting tasks demonstrate the superiority of MixRNN+ against existing methods, and a cloud-based system using MixRNN+ as the bedrock model showcases its practicality.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Engineering, Electrical & Electronic

Multi-Slot Over-the-Air Computation in Fading Channels

Suhua Tang, Petar Popovski, Chao Zhang, Sadao Obana

Summary: This paper proposes a multi-slot over-the-air computation (MS-AirComp) framework for fading channels in IoT systems, which improves channel gains and reduces signal distortion by utilizing multiple slots. A closed-form expression for the computation error is derived through theoretical analysis, and optimal parameters are found. Simulations show that the proposed method effectively reduces computation error.

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS (2023)

Article Computer Science, Information Systems

Exploiting Phase Difference of Arrival of V2X Signals for Pedestrian Positioning: Key Methods and Simulation Evaluation

Suhua Tang, Sadao Obana

Summary: Pedestrian-to-vehicle communication is crucial for preventing pedestrian accidents, especially when pedestrians are in blind spots. However, in urban canyons, buildings obstruct satellite signals, leading to interruptions in pedestrian positioning. This paper proposes using vehicles and roadside units as positioning anchors to address this issue.

IEEE ACCESS (2023)

Article Computer Science, Artificial Intelligence

Decoupling Long- and Short-Term Patterns in Spatiotemporal Inference

Junfeng Hu, Yuxuan Liang, Zhencheng Fan, Li Liu, Yifang Yin, Roger Zimmermann

Summary: Sensors are crucial for environmental monitoring in smart cities, but deploying them at massive scale is impractical due to high costs, resulting in sparse data collection. This article focuses on inferring values at nonsensor locations based on observations from available sensors (spatiotemporal inference) by capturing relationships among the data. The investigations reveal distinct patterns at long- and short-term temporal scales, and the authors propose decoupling the modeling of short- and long-term patterns. Experimental results demonstrate the effectiveness of the proposed method in capturing both long- and short-term relations.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Proceedings Paper Computer Science, Artificial Intelligence

FaceLivePlus: A Unified System for Face Liveness Detection and Face Verification

Ying Zhang, Lilei Zheng, Vrizlynn L. L. Thing, Roger Zimmermann, Bin Guo, Zhiwen Yu

Summary: Face verification is commonly used to verify someone's identity, but it can be vulnerable to face spoofing attacks. To enhance security and reduce computational and storage costs, a new system has been developed that learns a single and universal face descriptor for both face verification and liveness detection.

PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023 (2023)

Proceedings Paper Computer Science, Artificial Intelligence

A Multi-Teacher Assisted Knowledge Distillation Approach for Enhanced Face Image Authentication

Tiancong Cheng, Ying Zhang, Yifang Yin, Roger Zimmermann, Zhiwen Yu, Bin Guo

Summary: This paper proposes a compressed multitask model that performs face recognition and face anti-spoofing tasks simultaneously in a lightweight manner, reducing the redundancy of the original dual-model. By using a multi-teacher-assisted knowledge distillation method and feature alignment, satisfactory performance is achieved with significant reductions in model size and inference time.

PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023 (2023)

Article Computer Science, Artificial Intelligence

Confidence-based and sample-reweighted test-time adaptation

Hao Yang, Min Wang, Zhengfei Yu, Hang Zhang, Jinshen Jiang, Yun Zhou

Summary: In this paper, a novel method called CSTTA is proposed for test-time adaptation (TTA), which uses confidence-based optimization and sample reweighting to make better use of sample information. Extensive experiments demonstrate the effectiveness of the proposed method.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

A novel method for generating a canonical basis for decision implications based on object-induced three-way operators

Jin Liu, Ju-Sheng Mi, Dong-Yun Niu

Summary: This article focuses on a novel method for generating a canonical basis for decision implications based on object-induced operators (OE operators). The logic of decision implication based on OE operators is described, and a method for obtaining the canonical basis for decision implications is given. The completeness, nonredundancy, and optimality of the canonical basis are proven. Additionally, a method for generating true premises based on OE operators is proposed.

KNOWLEDGE-BASED SYSTEMS (2024)

Review Computer Science, Artificial Intelligence

Efficient utilization of pre-trained models: A review of sentiment analysis via prompt learning

Kun Bu, Yuanchao Liu, Xiaolong Ju

Summary: This paper discusses the importance of sentiment analysis and pre-trained models in natural language processing, and explores the application of prompt learning. The research shows that prompt learning is more suitable for sentiment analysis tasks and can achieve good performance.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

M-EDEM: A MNN-based Empirical Decomposition Ensemble Method for improved time series forecasting

Xiangjun Cai, Dagang Li

Summary: This paper presents a new decomposition mechanism based on learned decomposition mapping. By using a neural network to learn the relationship between original time series and decomposed results, the repetitive computation overhead during rolling decomposition is relieved. Additionally, extended mapping and partial decomposition methods are proposed to alleviate boundary effects on prediction performance. Comparative studies demonstrate that the proposed method outperforms existing RDEMs in terms of operation speed and prediction accuracy.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Privacy-preserving trust management method based on blockchain for cross-domain industrial IoT

Xu Wu, Yang Liu, Jie Tian, Yuanpeng Li

Summary: This paper proposes a blockchain-based privacy-preserving trust management architecture, which adopts federated learning to train task-specific trust models and utilizes differential privacy to protect device privacy. In addition, a game theory-based incentive mechanism and a parallel consensus protocol are proposed to improve the accuracy of trust computing and the efficiency of consensus.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

MV-ReID: 3D Multi-view Transformation Network for Occluded Person Re-Identification

Zaiyang Yu, Prayag Tiwari, Luyang Hou, Lusi Li, Weijun Li, Limin Jiang, Xin Ning

Summary: This study introduces a 3D view-based approach that effectively handles occlusions and leverages the geometric information of 3D objects. The proposed method achieves state-of-the-art results on occluded ReID tasks and exhibits competitive performance on holistic ReID tasks.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

City-scale continual neural semantic mapping with three-layer sampling and panoptic representation

Yongliang Shi, Runyi Yang, Zirui Wu, Pengfei Li, Caiyun Liu, Hao Zhao, Guyue Zhou

Summary: Neural implicit representations have gained attention due to their expressive, continuous, and compact properties. However, there is still a lack of research on city-scale continual implicit dense mapping based on sparse LiDAR input. In this study, a city-scale continual neural mapping system with a panoptic representation is developed, incorporating environment-level and instance-level modeling. A tailored three-layer sampling strategy and category-specific prior are proposed to address the challenges of representing geometric information in city-scale space and achieving high fidelity mapping of instances under incomplete observation.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

MDSSN: An end-to-end deep network on triangle mesh parameterization

Ruihan Hu, Zhi-Ri Tang, Rui Yang, Zhongjie Wang

Summary: Mesh data is crucial for 3D computer vision applications worldwide, but traditional deep learning frameworks have struggled with handling meshes. This paper proposes MDSSN, a simple mesh computation framework that models triangle meshes and represents their shape using face-based and edge-based Riemannian graphs. The framework incorporates end-to-end operators inspired by traditional deep learning frameworks, and includes dedicated modules for addressing challenges in mesh classification and segmentation tasks. Experimental results demonstrate that MDSSN outperforms other state-of-the-art approaches.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Semi-supervised learning with missing values imputation

Buliao Huang, Yunhui Zhu, Muhammad Usman, Huanhuan Chen

Summary: This paper proposes a novel semi-supervised conditional normalizing flow (SSCFlow) algorithm that combines unsupervised imputation and supervised classification. By estimating the conditional distribution of incomplete instances, SSCFlow facilitates imputation and classification simultaneously, addressing the issue of separated tasks ignoring data distribution and label information in traditional methods.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Emotion-and-knowledge grounded response generation in an open-domain dialogue setting

Deeksha Varshney, Asif Ekbal, Erik Cambria

Summary: This paper focuses on the neural-based interactive dialogue system that aims to engage and retain humans in long-lasting conversations. It proposes a new neural generative model that combines step-wise co-attention, self-attention-based transformer network, and an emotion classifier to control emotion and knowledge transfer during response generation. The results from quantitative, qualitative, and human evaluation show that the proposed models can generate natural and coherent sentences, capturing essential facts with significant improvement over emotional content.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

MvTS-library: An open library for deep multivariate time series forecasting

Junchen Ye, Weimiao Li, Zhixin Zhang, Tongyu Zhu, Leilei Sun, Bowen Du

Summary: Modeling multivariate time series has long been a topic of interest for scholars in various fields. This paper introduces MvTS, an open library based on PyTorch, which provides a unified framework for implementing and evaluating these models. Extensive experiments on public datasets demonstrate the effectiveness and universality of the models reproduced by MvTS.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

An adaptive hybrid mutated differential evolution feature selection method for low and high-dimensional medical datasets

Reham R. Mostafa, Ahmed M. Khedr, Zaher Al Aghbari, Imad Afyouni, Ibrahim Kamel, Naveed Ahmed

Summary: Feature selection is crucial in classification procedures, but it faces challenges in high-dimensional datasets. To overcome these challenges, this study proposes an Adaptive Hybrid-Mutated Differential Evolution method that incorporates the mechanics of the Spider Wasp Optimization algorithm and the concept of Enhanced Solution Quality. Experimental results demonstrate the effectiveness of the method in terms of accuracy and convergence speed, and it outperforms contemporary cutting-edge algorithms.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

TCM Model for improving track sequence classification in real scenarios with Multi-Feature Fusion and Transformer Block

Ti Xiang, Pin Lv, Liguo Sun, Yipu Yang, Jiuwu Hao

Summary: This paper introduces a Track Classification Model (TCM) based on marine radar, which can effectively recognize and classify shipping tracks. By using a feature extraction network with multi-feature fusion and a dataset production method to address missing labels, the classification accuracy is improved, resulting in successful engineering application in real scenarios.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Language model as an Annotator: Unsupervised context-aware quality phrase generation

Zhihao Zhang, Yuan Zuo, Chenghua Lin, Junjie Wu

Summary: This paper proposes a novel unsupervised context-aware quality phrase mining framework called LMPhrase, which is built upon large pre-trained language models. The framework mines quality phrases as silver labels using a parameter-free probing technique on the pre-trained language model BERT, and formalizes the phrase tagging task as a sequence generation problem by fine-tuning the Sequence-to-Sequence pre-trained language model BART. The results of extensive experiments show that LMPhrase consistently outperforms existing competitors in two phrase mining tasks of different granularity.

KNOWLEDGE-BASED SYSTEMS (2024)

Article Computer Science, Artificial Intelligence

Stochastic Gradient Descent for matrix completion: Hybrid parallelization on shared- and distributed-memory systems

Kemal Buyukkaya, M. Ozan Karsavuran, Cevdet Aykanat

Summary: The study aims to investigate the hybrid parallelization of the Stochastic Gradient Descent (SGD) algorithm for solving the matrix completion problem on a high-performance computing platform. A hybrid parallel decentralized SGD framework with asynchronous inter-process communication and a novel flexible partitioning scheme is proposed to achieve scalability up to hundreds of processors. Experimental results on real-world benchmark datasets show that the proposed algorithm achieves 6x higher throughput on sparse datasets compared to the state-of-the-art, while achieving comparable throughput on relatively dense datasets.

KNOWLEDGE-BASED SYSTEMS (2024)