The multi-modal fusion in visual question answering: a review of attention mechanisms
Published 2023
Journal
PeerJ Computer Science
Volume 9, Pages e1400
Publisher
PeerJ
Online
2023-05-30
DOI
10.7717/peerj-cs.1400
Related references
Note: Only some of the references are listed.
- Transformers in Vision: A Survey
- (2022) Salman Khan et al. ACM COMPUTING SURVEYS
- Research on Visual Question Answering Based on GAT Relational Reasoning
- (2022) Yalin Miao et al. NEURAL PROCESSING LETTERS
- Deep Modular Bilinear Attention Network for Visual Question Answering
- (2022) Feng Yan et al. SENSORS
- Sparse co-attention visual question answering networks based on thresholds
- (2022) Zihan Guo et al. APPLIED INTELLIGENCE
- Dual Self-Guided Attention with Sparse Question Networks for Visual Question Answering
- (2022) Xiang Shen et al. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
- No-Reference Video Quality Assessment Using Multi-Pooled, Saliency Weighted Deep Features and Decision Fusion
- (2022) Domonkos Varga SENSORS
- A Deep Fusion Matching Network Semantic Reasoning Model
- (2022) Wenfeng Zheng et al. Applied Sciences-Basel
- Multi-Modal Alignment of Visual Question Answering Based on Multi-Hop Attention Mechanism
- (2022) Qihao Xia et al. Electronics
- Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network
- (2022) Wenfeng Zheng et al. PeerJ Computer Science
- Attention mechanisms in computer vision: A survey
- (2022) Meng-Hao Guo et al. Computational Visual Media
- SPCA-Net: a based on spatial position relationship co-attention network for visual question answering
- (2022) Feng Yan et al. VISUAL COMPUTER
- Medical visual question answering based on question-type reasoning and semantic space constraint
- (2022) Meiling Wang et al. ARTIFICIAL INTELLIGENCE IN MEDICINE
- A Bi-level representation learning model for medical visual question answering
- (2022) Yong Li et al. JOURNAL OF BIOMEDICAL INFORMATICS
- AMAM: An Attention-based Multimodal Alignment Model for Medical Visual Question Answering
- (2022) Haiwei Pan et al. KNOWLEDGE-BASED SYSTEMS
- Multi-modal co-attention relation networks for visual question answering
- (2022) Zihan Guo et al. VISUAL COMPUTER
- Research on visual question answering based on dynamic memory network model of multiple attention mechanisms
- (2022) Yalin Miao et al. Scientific Reports
- Path-Wise Attention Memory Network for Visual Question Answering
- (2022) Yingxin Xiang et al. Mathematics
- Visual question answering model based on the fusion of multimodal features by a two-way co-attention mechanism
- (2022) Himanshu Sharma et al. IMAGING SCIENCE JOURNAL
- Explanation vs. attention: A two-player game to obtain attention for VQA and visual dialog
- (2022) Badri N. Patro et al. PATTERN RECOGNITION
- Cross-modality co-attention networks for visual question answering
- (2021) Dezhi Han et al. Soft Computing
- Knowledge mapping of computer applications in education using CiteSpace
- (2021) Keshav S. Rawat et al. COMPUTER APPLICATIONS IN ENGINEERING EDUCATION
- Sentence Representation Method Based on Multi-Layer Semantic Network
- (2021) Wenfeng Zheng et al. Applied Sciences-Basel
- Joint embedding VQA model based on dynamic word vector
- (2021) Zhiyang Ma et al. PeerJ Computer Science
- Multimodal feature-wise co-attention method for visual question answering
- (2021) Sheng Zhang et al. Information Fusion
- Multi Visual and Textual Embedding on Visual Question Answering for Blind People
- (2021) Tung Le et al. NEUROCOMPUTING
- A review on the attention mechanism of deep learning
- (2021) Zhaoyang Niu et al. NEUROCOMPUTING
- Dual self-attention with co-attention networks for visual question answering
- (2021) Yun Liu et al. PATTERN RECOGNITION
- Mutual Attention Inception Network for Remote Sensing Visual Question Answering
- (2021) Xiangtao Zheng et al. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
- Knowledge base graph embedding module design for Visual question answering model
- (2021) Wenfeng Zheng et al. PATTERN RECOGNITION
- An Attentive Survey of Attention Models
- (2021) Sneha Chaudhari et al. ACM Transactions on Intelligent Systems and Technology
- Object-difference drived graph convolutional networks for visual question answering
- (2020) Xi Zhu et al. MULTIMEDIA TOOLS AND APPLICATIONS
- Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval
- (2020) Jing Yu et al. IEEE TRANSACTIONS ON MULTIMEDIA
- Adversarial Learning With Multi-Modal Attention for Visual Question Answering
- (2020) Yun Liu et al. IEEE Transactions on Neural Networks and Learning Systems
- MRA-Net: Improving VQA Via Multi-Modal Relation Attention Network
- (2020) Liang Peng et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
- DRAU: Dual Recurrent Attention Units for Visual Question Answering
- (2019) Ahmed Osman et al. COMPUTER VISION AND IMAGE UNDERSTANDING
- Focal Visual-Text Attention for Memex Question Answering
- (2019) Junwei Liang et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
- (2019) Ramprasaath R. Selvaraju et al. INTERNATIONAL JOURNAL OF COMPUTER VISION
- Multimodal feature fusion by relational reasoning and attention for visual question answering
- (2019) Weifeng Zhang et al. Information Fusion
- Visual question answering model based on visual relationship detection
- (2019) Yuling Xi et al. SIGNAL PROCESSING-IMAGE COMMUNICATION
- Squeeze-and-Excitation Networks
- (2019) Jie Hu et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
- Vision-to-Language Tasks Based on Attributes and Attention Mechanism
- (2019) Xuelong Li et al. IEEE Transactions on Cybernetics
- Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering
- (2018) Zhou Yu et al. IEEE Transactions on Neural Networks and Learning Systems
- Multi attention module for visual tracking
- (2018) Boyu Chen et al. PATTERN RECOGNITION
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- (2017) Shaoqing Ren et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE