Article
Computer Science, Theory & Methods
George A. Vouros
Summary: Interpretability, explainability, and transparency are crucial factors in the implementation of artificial intelligence methods in various critical domains. This article provides a review of state-of-the-art methods for explainable deep reinforcement learning, with a focus on meeting the needs of human operators.
ACM COMPUTING SURVEYS
(2023)
Article
Automation & Control Systems
Qisen Yang, Huanqian Wang, Mukun Tong, Wenjie Shi, Gao Huang, Shiji Song
Summary: This paper discusses how deep reinforcement learning (RL) agents can be interpreted and explained, and proposes a method that incorporates rewards into feature discovery. A novel framework is introduced to solve the gradient-disconnection problem, and the method is evaluated across multiple experiments.
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Peter E. D. Love, Weili Fang, Jane Matthews, Stuart Porter, Hanbin Luo, Lieyun Ding
Summary: Machine learning (ML) and deep learning (DL) are branches of AI. ML is a form of AI that adapts automatically to changing datasets with minimal human intervention. DL, a subset of ML, uses artificial neural networks to imitate the learning process of the human brain. However, the inner workings of ML and DL models can be difficult to understand because of their 'black box' nature. Explainable artificial intelligence (XAI) can help explain and interpret the outputs of ML and DL models, reducing bias and error and improving confidence in decision-making. This paper presents a narrative review of XAI and a taxonomy of precepts and models to raise awareness of its potential opportunities for use in the construction industry.
ADVANCED ENGINEERING INFORMATICS
(2023)
Article
Computer Science, Information Systems
Pablo Zinemanas, Martin Rocamora, Marius Miron, Frederic Font, Xavier Serra
Summary: Deep learning models have advanced in many research areas, but their black-box structure limits understanding of their inner workings and predictions. A new interpretable deep learning model for automatic sound classification is proposed; it explains its predictions through similarities to learned prototypes and leverages domain knowledge. The model achieves results comparable to state-of-the-art methods on speech, music, and environmental audio classification tasks, and offers automatic pruning methods that improve interpretability.
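The prototype-similarity mechanism mentioned in this summary can be illustrated with a minimal sketch. This is not the authors' implementation: the toy embedding, prototypes, class assignments, and the log-ratio similarity below are illustrative assumptions (the similarity form follows the common ProtoPNet-style convention).

```python
import numpy as np

def prototype_predict(embedding, prototypes, prototype_classes, n_classes):
    """Classify by similarity to learned prototypes.

    Each prototype belongs to a class; the similarity of the input
    embedding to each prototype is pooled per class, and the class
    with the highest pooled similarity is predicted.
    """
    # Squared Euclidean distance from the embedding to every prototype.
    dists = np.sum((prototypes - embedding) ** 2, axis=1)
    # Convert distances to similarities (closer prototype => higher score).
    sims = np.log((dists + 1.0) / (dists + 1e-6))
    # Pool similarities per class by taking the maximum.
    scores = np.full(n_classes, -np.inf)
    for sim, cls in zip(sims, prototype_classes):
        scores[cls] = max(scores[cls], sim)
    return int(np.argmax(scores)), sims

# Two toy prototypes per class in a 3-D embedding space.
prototypes = np.array([[0., 0., 0.], [1., 0., 0.],
                       [5., 5., 5.], [4., 5., 5.]])
prototype_classes = [0, 0, 1, 1]
pred, sims = prototype_predict(np.array([0.9, 0.1, 0.0]),
                               prototypes, prototype_classes, n_classes=2)
```

The per-prototype similarities `sims` double as the explanation: a prediction can be traced back to the learned prototype it most resembles.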
Review
Radiology, Nuclear Medicine & Medical Imaging
Jordan D. Fuhrman, Naveena Gorre, Qiyuan Hu, Hui Li, Issam El Naqa, Maryellen L. Giger
Summary: The development of medical imaging AI for evaluating COVID-19 patients shows potential to enhance clinical decision making, with developers applying explainability techniques to increase user trust and the likelihood of clinical translation.
Article
Computer Science, Artificial Intelligence
Xinyi Xu, Zhengyang Wang, Cheng Deng, Hao Yuan, Shuiwang Ji
Summary: This study proposes an improved and interpretable grouping method to enhance the performance and interpretability of deep metric learning. The method utilizes attention mechanism with learnable queries to capture group-specific information. The results demonstrate that the proposed method consistently outperforms prior methods across various evaluation metrics, datasets, base models, and loss functions.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Michal Kolarik, Martin Sarnovsky, Jan Paralic, Frantisek Babic
Summary: Deep learning has shown effectiveness in medical diagnostic tasks compared to traditional machine learning methods, but its black-box nature hinders its real-world applications, especially in healthcare. Explainability of machine learning models is crucial for their adoption in clinical use, and this article reviews the approaches and applications of explainable deep learning in the specific area of medical video processing tasks. The article introduces the field of explainable AI, summarizes the important requirements for explainability in medical applications, provides an overview of existing methods and evaluation metrics, and focuses on those applicable to video data analysis in the medical domain. It also identifies open research issues in this area.
PEERJ COMPUTER SCIENCE
(2023)
Article
Environmental Sciences
Erfan Hasanpour Zaryabi, Loghman Moradi, Bahareh Kalantar, Naonori Ueda, Alfian Abdul Halin
Summary: This paper explores the effectiveness of attention mechanisms in improving CNN-based building segmentation and interprets the results using explainable AI (XAI) methods. The experimental results show that attention mechanisms substantially improve the quantitative metrics, consistent with the attribution visualizations produced by the XAI methods.
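A generic channel-attention block of the kind this summary refers to can be sketched in a few lines of numpy. This is a squeeze-and-excitation-style illustration under assumed shapes and random weights, not the specific attention modules studied in the paper.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative).

    feature_map: (C, H, W) activations; w1: (C//r, C); w2: (C, C//r).
    Returns the reweighted feature map and the per-channel weights,
    which can also be inspected as a coarse attribution signal.
    """
    c = feature_map.shape[0]
    # Squeeze: global average pooling per channel -> (C,)
    squeezed = feature_map.mean(axis=(1, 2))
    # Excitation: bottleneck MLP with ReLU, then sigmoid gating.
    hidden = np.maximum(w1 @ squeezed, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Scale: reweight each channel by its attention weight.
    return feature_map * weights.reshape(c, 1, 1), weights

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))     # 8 channels, 4x4 spatial map
w1 = rng.standard_normal((2, 8)) * 0.1    # reduction ratio r = 4
w2 = rng.standard_normal((8, 2)) * 0.1
out, weights = channel_attention(fmap, w1, w2)
```

Because the gate is a per-channel scalar in (0, 1), the learned `weights` offer a simple, built-in view of which channels the network emphasizes, which is one reason attention pairs naturally with XAI attribution methods.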
Article
Agronomy
Kaihua Wei, Bojian Chen, Jingcheng Zhang, Shanhui Fan, Kaihua Wu, Guangyu Liu, Dongmei Chen
Summary: Explainable artificial intelligence has been studied extensively, but research on interpretable methods in agriculture remains limited. This study examines the interpretability of deep learning models in agricultural classification tasks using a fruit-leaf dataset. The results show that the ResNet model achieves the highest accuracy in the experiments and that the attention module improves feature extraction and helps clarify the model's focus across the different experiments.
Article
Computer Science, Interdisciplinary Applications
Hao Guo, Yu Zhang, Kunpeng Zhu
Summary: This study proposes a multi-scale pyramid attention network (MPAN) for tool wear monitoring. The method can accurately monitor tool wear and provide interpretability from the aspects of network structure design and feature extraction. Experimental results demonstrate the effectiveness and feasibility of this method.
COMPUTERS IN INDUSTRY
(2022)
Article
Chemistry, Multidisciplinary
Soumick Chatterjee, Arnab Das, Chirag Mandal, Budhaditya Mukhopadhyay, Manish Vipinraj, Aniruddh Shukla, Rajatha Nagaraja Rao, Chompunuch Sarasaen, Oliver Speck, Andreas Nuernberger
Summary: This paper presents a method for interpreting and explaining the results of deep learning algorithms and proposes a unified framework for generating visual interpretations for clinicians. The framework extends existing interpretability techniques and allows for quantitative comparison of visual explanations.
APPLIED SCIENCES-BASEL
(2022)
Article
Computer Science, Artificial Intelligence
Shiye Lei, Fengxiang He, Yancheng Yuan, Dacheng Tao
Summary: This article finds that neural networks with less variability in their decision boundaries generalize better. It introduces the concepts of algorithm DB variability and (epsilon, eta)-data DB variability to measure decision boundary (DB) variability, and experiments show significant negative correlations between this variability and generalizability.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Tristan Gomez, Suiyi Ling, Thomas Freour, Harold Mouchere
Summary: The prevalence of attention mechanisms has raised concerns about their interpretability. In order to improve the interpretability of attention models, a novel strategy called Bilinear Representative Non-Parametric Attention (BR-NPA) is proposed. This strategy captures task-relevant human-interpretable information and provides more comprehensive and accurate visual explanations compared to state-of-the-art attention models and visualization methods.
PATTERN RECOGNITION
(2022)
Review
Physiology
Samuel P. Border, Pinaki Sarder
Summary: This article points out that although deep learning and artificial intelligence techniques have brought significant performance gains in pathology, little research has been done to answer the crucial question of why these algorithms make the predictions they do. Tracing classification decisions back to specific input features allows model bias to be identified and provides additional information for understanding underlying biological mechanisms. In digital pathology, increasing the explainability of AI models would have the largest and most immediate impact on image classification tasks. The article details several considerations that should be made in order to develop models with a focus on explainability.
FRONTIERS IN PHYSIOLOGY
(2022)
Review
Computer Science, Artificial Intelligence
Ricards Marcinkevics, Julia E. Vogt
Summary: Interpretability and explainability are essential for ML and statistical applications in various fields. Although they lack a precise definition, many models and techniques have been developed, with deep learning gaining more attention. This article discusses state-of-the-art examples, including rule-based, sparse, and additive classification models, interpretable representation learning, and methods for explaining black-box models. It emphasizes the importance and relevance of interpretability and explainability, the divide between them, and the biases behind interpretable models and explanation methods.
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY
(2023)
Article
Robotics
Vida Adeli, Ehsan Adeli, Ian Reid, Juan Carlos Niebles, Hamid Rezatofighi
IEEE ROBOTICS AND AUTOMATION LETTERS
(2020)
Article
Computer Science, Theory & Methods
Pablo Messina, Pablo Pino, Denis Parra, Alvaro Soto, Cecilia Besa, Sergio Uribe, Marcelo Andia, Cristian Tejos, Claudia Prieto, Daniel Capurro
Summary: Physicians face growing demand from patients for image-based diagnosis each year, a demand that recent advances in artificial intelligence can help address. A survey of work on automatic report generation from medical images using deep neural networks shows progress in datasets, architecture design, explainability, and evaluation metrics, but challenges remain, especially in evaluating the accuracy of the generated reports.
ACM COMPUTING SURVEYS
(2022)
Article
Radiology, Nuclear Medicine & Medical Imaging
Ke Wang, Jonathan Tamir, Alfredo De Goyeneche, Uri Wollner, Rafi Brada, Stella X. Yu, Michael Lustig
Summary: This study aims to improve the fidelity of fine structures and textures in deep learning-based reconstructions. A novel patch-based unsupervised feature loss method is proposed to preserve perceptual similarity and high-order statistics. Experimental results demonstrate that this method can produce more realistic reconstructions with finer textures, sharper edges, and improved overall image quality.
MAGNETIC RESONANCE IN MEDICINE
(2022)
Article
Plant Sciences
Victor Chano, Oliver Gailing, Carmen Collada, Alvaro Soto
Summary: Resprouting is a crucial trait in population dynamics, and Pinus canariensis is one of the few conifer species capable of resprouting. In this study, we analyzed gene expression during wound-induced resprouting in five-year-old Canarian pines and identified key differentially expressed genes (DEGs) at different stages of resprouting. Our findings suggest similarities between lateral shoot development in gymnosperms and apical growth in flowering plants, indicating potential homologies between these processes.
PLANT GROWTH REGULATION
(2023)
Article
Computer Science, Artificial Intelligence
Vladimir Araujo, Marie-Francine Moens, Alvaro Soto
Summary: Learning sentence representations is an important and challenging problem in deep learning and natural language processing. Previous methods focused on learning contextualized word representations but failed to capture the structure and discourse relationships of contiguous sentences. This work improves pretrained models by applying predictive coding theory and shows consistent improvements in sentence representations for both English and Spanish. It also demonstrates, through extensive experimentation and validation, the models' ability to capture discourse and pragmatics knowledge.
MACHINE LEARNING AND KNOWLEDGE EXTRACTION
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Le Xue, Mingfei Gao, Chen Xing, Roberto Martin-Martin, Jiajun Wu, Caiming Xiong, Ran Xu, Juan Carlos Niebles, Silvio Savarese
Summary: The recognition capabilities of current state-of-the-art 3D models are limited by datasets with a small number of annotated data and a pre-defined set of categories. This study introduces ULIP, a framework that utilizes multimodal information to improve the understanding of 3D modality. ULIP is pre-trained with object triplets from image, text, and 3D point cloud and achieves state-of-the-art performance in 3D classification tasks.
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Joaquin Ossandon, Benjamin Earle, Alvaro Soto
Summary: The Vision-and-Language Navigation task involves navigating an indoor environment using only visual information, guided by a textual instruction. Existing AI models still struggle with this task, which is easy for humans. The authors propose that poor utilization of visual information is the main reason for the low performance of current models. They support this hypothesis with experimental evidence, showing that state-of-the-art models are not significantly affected when given limited or no visual data, indicating overfitting to the textual instructions. To address this issue, the authors introduce a new data augmentation method that incorporates more explicit visual information into the generation of textual instructions, leading to an 8% performance increase in unseen environments.
COMPUTER VISION, ECCV 2022, PT XXXVII
(2022)
Proceedings Paper
Computer Science, Interdisciplinary Applications
Vladimir Araujo, Andres Carvallo, Souvik Kundu, Jose Caneteo, Marcelo Mendoza, Robert E. Mercer, Felipe Bravo-Marquez, Marie-Francine Moens, Alvaro Soto
Summary: With the success of pre-trained language models, versions in languages other than English have emerged. However, evaluation methods for these models remain limited for languages such as Spanish. This paper aims to bridge that gap by introducing two evaluation benchmarks, Spanish SentEval and Spanish DiscoEval, for assessing stand-alone and discourse-aware sentence representations, respectively. The benchmarks include a variety of datasets from different domains, and the authors also evaluate and analyze the capabilities and limitations of recent pre-trained Spanish language models. The findings show that for discourse evaluation tasks, mBERT, a language model trained on multiple languages, generally outperforms models trained solely on Spanish documents. The study's broader goal is to motivate fairer, more comparable, and less cumbersome evaluation of future Spanish language models.
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Carlos Hinojosa, Miguel Marquez, Henry Arguello, Ehsan Adeli, Li Fei-Fei, Juan Carlos Niebles
Summary: This paper proposes an optimizing framework that degrades video quality to protect privacy attributes and maintain relevant features for activity recognition.
COMPUTER VISION - ECCV 2022, PT IV
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Mingfei Gao, Chen Xing, Juan Carlos Niebles, Junnan Li, Ran Xu, Wenhao Liu, Caiming Xiong
Summary: Despite progress in object detection, most methods are limited to a specific set of object categories. This paper proposes a method to automatically generate pseudo bounding-box annotations from image-caption pairs, expanding the base classes and improving object detection performance.
COMPUTER VISION, ECCV 2022, PT X
(2022)
Proceedings Paper
Computer Science, Theory & Methods
Vladimir Araujo, Julio Hurtado, Alvaro Soto, Marie-Francine Moens
Summary: Compared with humans, deep learning models have a limited ability to learn continually. To address this issue, we propose a novel method called Entropy-based Stability-Plasticity (ESP), which dynamically determines the modification level of each model layer through a plasticity factor, reducing interference and speeding up training.
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022
(2022)
Proceedings Paper
Business
Cristobal Eyzaguirre, Felipe del Rio, Vladimir Araujo, Alvaro Soto
Summary: This paper proposes DACT-BERT, a differentiable adaptive computation time strategy for BERT-like models, which controls the number of Transformer blocks that need to be executed at inference time. Experimental results demonstrate that the proposed approach performs well on a reduced computational regime and is competitive in other cases.
PROCEEDINGS OF THE FIRST WORKSHOP ON EFFICIENT BENCHMARKING IN NLP (NLP POWER 2022)
(2022)
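The adaptive computation idea behind DACT-BERT can be illustrated with a generic ACT-style sketch. This is not the paper's exact formulation: the per-block outputs and halting probabilities below are hypothetical, and the combination rule is a common adaptive-computation-time convention assumed for illustration. Each block contributes its prediction weighted by the probability of halting there, and inference stops once the accumulated halting mass passes a threshold.

```python
import numpy as np

def dact_style_inference(block_outputs, halt_probs, threshold=0.99):
    """Combine per-block predictions, stopping once the accumulated
    halting probability mass passes the threshold.

    block_outputs: per-block prediction vectors
    halt_probs: per-block halting probabilities in (0, 1)
    Returns the combined prediction and the number of blocks executed.
    """
    combined = np.zeros_like(block_outputs[0])
    remain = 1.0  # probability mass not yet halted
    used = 0
    for out, p in zip(block_outputs, halt_probs):
        combined += remain * p * out   # weight this block's prediction
        remain *= (1.0 - p)            # mass carried to later blocks
        used += 1
        if 1.0 - remain >= threshold:  # confident enough: stop early
            break
    return combined, used

# Four hypothetical per-block class-score vectors; high halting
# confidence lets inference skip the last two blocks entirely.
outputs = [np.array([0.2, 0.8]), np.array([0.1, 0.9]),
           np.array([0.15, 0.85]), np.array([0.1, 0.9])]
combined, used = dact_style_inference(outputs, [0.9, 0.9, 0.9, 0.9])
```

Because the combination is a differentiable weighted sum during training but admits early exit at inference, the model can trade accuracy for compute without retraining, which matches the reduced-computation regime the summary describes.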
Proceedings Paper
Computer Science, Artificial Intelligence
Vladimir Araujo, Andres Villa, Marcelo Mendoza, Marie-Francine Moens, Alvaro Soto
Summary: The study proposes using ideas from predictive coding theory to augment language models and improve performance in discourse relationship detection by learning suitable discourse-level representations.
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021)
(2021)
Article
Computer Science, Information Systems
Julio Hurtado, Hans Lobel, Alvaro Soto
Summary: This study presents two strategies to address the task-interference problem in deep learning: one uses a sparse-coding technique to adaptively allocate model capacity and avoid interference, and the other uses a meta-learning technique to encourage knowledge transfer among tasks.