The multi-modal fusion in visual question answering: a review of attention mechanisms
Publication year: 2023
Title
The multi-modal fusion in visual question answering: a review of attention mechanisms
Authors
Keywords
-
Publication
PeerJ Computer Science
Volume 9, Pages e1400
Publisher
PeerJ
Publication date
2023-05-30
DOI
10.7717/peerj-cs.1400
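References
Related references
Note: Only some of the references are listed here; download the original article for the complete reference information.
- Transformers in Vision: A Survey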
- (2022) Salman Khan et al. ACM COMPUTING SURVEYS
- Research on Visual Question Answering Based on GAT Relational Reasoning
- (2022) Yalin Miao et al. NEURAL PROCESSING LETTERS
- Deep Modular Bilinear Attention Network for Visual Question Answering
- (2022) Feng Yan et al. SENSORS
- Sparse co-attention visual question answering networks based on thresholds
- (2022) Zihan Guo et al. APPLIED INTELLIGENCE
- Dual Self-Guided Attention with Sparse Question Networks for Visual Question Answering
- (2022) Xiang SHEN et al. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
- No-Reference Video Quality Assessment Using Multi-Pooled, Saliency Weighted Deep Features and Decision Fusion
- (2022) Domonkos Varga SENSORS
- A Deep Fusion Matching Network Semantic Reasoning Model
- (2022) Wenfeng Zheng et al. Applied Sciences-Basel
- Multi-Modal Alignment of Visual Question Answering Based on Multi-Hop Attention Mechanism
- (2022) Qihao Xia et al. Electronics
- Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network
- (2022) Wenfeng Zheng et al. PeerJ Computer Science
- Attention mechanisms in computer vision: A survey
- (2022) Meng-Hao Guo et al. Computational Visual Media
- SPCA-Net: a based on spatial position relationship co-attention network for visual question answering
- (2022) Feng Yan et al. VISUAL COMPUTER
- Medical visual question answering based on question-type reasoning and semantic space constraint
- (2022) Meiling Wang et al. ARTIFICIAL INTELLIGENCE IN MEDICINE
- A Bi-level representation learning model for medical visual question answering
- (2022) Yong Li et al. JOURNAL OF BIOMEDICAL INFORMATICS
- AMAM: An Attention-based Multimodal Alignment Model for Medical Visual Question Answering
- (2022) Haiwei Pan et al. KNOWLEDGE-BASED SYSTEMS
- Multi-modal co-attention relation networks for visual question answering
- (2022) Zihan Guo et al. VISUAL COMPUTER
- Research on visual question answering based on dynamic memory network model of multiple attention mechanisms
- (2022) Yalin Miao et al. Scientific Reports
- Path-Wise Attention Memory Network for Visual Question Answering
- (2022) Yingxin Xiang et al. Mathematics
- Visual question answering model based on the fusion of multimodal features by a two-way co-attention mechanism
- (2022) Himanshu Sharma et al. IMAGING SCIENCE JOURNAL
- Explanation vs. attention: A two-player game to obtain attention for VQA and visual dialog
- (2022) Badri N. Patro et al. PATTERN RECOGNITION
- Cross-modality co-attention networks for visual question answering
- (2021) Dezhi Han et al. Soft Computing
- Knowledge mapping of computer applications in education using CiteSpace
- (2021) Keshav S. Rawat et al. COMPUTER APPLICATIONS IN ENGINEERING EDUCATION
- Sentence Representation Method Based on Multi-Layer Semantic Network
- (2021) Wenfeng Zheng et al. Applied Sciences-Basel
- Joint embedding VQA model based on dynamic word vector
- (2021) Zhiyang Ma et al. PeerJ Computer Science
- Multimodal feature-wise co-attention method for visual question answering
- (2021) Sheng Zhang et al. Information Fusion
- Multi Visual and Textual Embedding on Visual Question Answering for Blind People
- (2021) Tung Le et al. NEUROCOMPUTING
- A review on the attention mechanism of deep learning
- (2021) Zhaoyang Niu et al. NEUROCOMPUTING
- Dual self-attention with co-attention networks for visual question answering
- (2021) Yun Liu et al. PATTERN RECOGNITION
- Mutual Attention Inception Network for Remote Sensing Visual Question Answering
- (2021) Xiangtao Zheng et al. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
- Knowledge base graph embedding module design for Visual question answering model
- (2021) Wenfeng Zheng et al. PATTERN RECOGNITION
- An Attentive Survey of Attention Models
- (2021) Sneha Chaudhari et al. ACM Transactions on Intelligent Systems and Technology
- Object-difference drived graph convolutional networks for visual question answering
- (2020) Xi Zhu et al. MULTIMEDIA TOOLS AND APPLICATIONS
- Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval
- (2020) Jing Yu et al. IEEE TRANSACTIONS ON MULTIMEDIA
- Adversarial Learning With Multi-Modal Attention for Visual Question Answering
- (2020) Yun Liu et al. IEEE Transactions on Neural Networks and Learning Systems
- MRA-Net: Improving VQA Via Multi-Modal Relation Attention Network
- (2020) Liang Peng et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
- DRAU: Dual Recurrent Attention Units for Visual Question Answering
- (2019) Ahmed Osman et al. COMPUTER VISION AND IMAGE UNDERSTANDING
- Focal Visual-Text Attention for Memex Question Answering
- (2019) Junwei Liang et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
- Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
- (2019) Ramprasaath R. Selvaraju et al. INTERNATIONAL JOURNAL OF COMPUTER VISION
- Multimodal feature fusion by relational reasoning and attention for visual question answering
- (2019) Weifeng Zhang et al. Information Fusion
- Visual question answering model based on visual relationship detection
- (2019) Yuling Xi et al. SIGNAL PROCESSING-IMAGE COMMUNICATION
- Squeeze-and-Excitation Networks
- (2019) Jie Hu et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
- Vision-to-Language Tasks Based on Attributes and Attention Mechanism
- (2019) Xuelong Li et al. IEEE Transactions on Cybernetics
- Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering
- (2018) Zhou Yu et al. IEEE Transactions on Neural Networks and Learning Systems
- Multi attention module for visual tracking
- (2018) Boyu Chen et al. PATTERN RECOGNITION
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- (2017) Shaoqing Ren et al. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE