Article
Chemistry, Multidisciplinary
Xiaoping Jiang, Huilin Zhao, Junwei Liu
Summary: This research proposes a classification method based on multi-modal image fusion and a CNN-PCA-SVM pipeline for recognizing flotation working conditions from visible and infrared gray-scale foam images. The visible and infrared gray-scale images are fused using the parameter-adaptive pulse-coupled neural network (PAPCNN) method and an image-quality detection method in the non-subsampled shearlet transform (NSST) domain. A convolutional neural network (CNN) serves as a trainable feature extractor for the fused foam images, while principal component analysis (PCA) and a support vector machine (SVM) handle feature reduction and classification. Experimental results show that the proposed model can accurately fuse foam images and classify flotation conditions.
APPLIED SCIENCES-BASEL
(2023)
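The PCA stage of a CNN-PCA-SVM pipeline like the one above can be sketched as follows. This is an illustrative sketch only, not the paper's code: the feature dimensions, sample count, and number of retained components are hypothetical, and the CNN features are stood in for by random data.

```python
import numpy as np

# Hypothetical CNN feature matrix: 20 fused foam images x 64 features.
rng = np.random.default_rng(0)
cnn_feats = rng.normal(size=(20, 64))

# PCA via SVD: center the features, decompose, keep the top-k components.
k = 8
centered = cnn_feats - cnn_feats.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ Vt[:k].T   # (20, 8) low-dimensional inputs for the SVM

print(reduced.shape)
```

The reduced matrix would then be fed to an SVM classifier; keeping only the leading components discards low-variance directions that mostly carry noise.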
Article
Computer Science, Artificial Intelligence
Han Xu, Jiteng Yuan, Jiayi Ma
Summary: This study proposes a novel method called MURF that mutually reinforces image registration and fusion. MURF consists of three modules, which progressively correct global and local offsets during the coarse-to-fine registration process and incorporate texture enhancement into image fusion. Extensive experiments validate the superiority and universality of MURF on different types of multi-modal data.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Xin Deng, Pier Luigi Dragotti
Summary: In this paper, a novel deep convolutional neural network is proposed to address general multi-modal image restoration and fusion problems, drawing inspiration from a new multi-modal convolutional sparse coding model. The proposed CU-Net architecture automatically separates common and unique information and consists of three modules: unique feature extraction, common feature preservation, and image reconstruction. Extensive numerical results validate the effectiveness of the method on tasks such as RGB-guided depth image super-resolution and multi-focus image fusion.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2021)
Article
Computer Science, Information Systems
Yuqi Liu, Zehua Sheng, Hui-Liang Shen
Summary: This paper proposes a fusion framework for image deblurring, called Guided Deblurring Fusion Network (GDFNet), to integrate multi-modal information for better deblurring performance. GDFNet uses image fusion techniques to obtain a deblurred image and employs a blur/residual image splitting strategy and a 2-level coarse-to-fine reconstruction strategy to enhance the deblurring result.
Article
Computer Science, Information Systems
Jingdan Li, Yi Wang, Dexin Zhao
Summary: This paper proposes a Gated Adaptive Controller Attention (GACA) method, which explores the complementarity of text features with region and grid features through attentional operations and adaptively fuses the two visual features with a gating mechanism to obtain a comprehensive image representation. During decoding, a Layer-wise Enhanced Cross-Attention (LECA) module calculates cross-attention between the generated word-embedding vectors and multi-level visual information in the encoder, yielding enhanced visual features. Extensive experiments demonstrate that the proposed model achieves new state-of-the-art performance on the MS COCO dataset.
MULTIMEDIA SYSTEMS
(2023)
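The gating idea above can be illustrated with a minimal sketch. This is not the GACA implementation: the feature vectors and gate weights are hypothetical, and a real model would learn the gate parameters and operate on batched tensors.

```python
import numpy as np

def gated_fuse(region, grid, w):
    """Toy gated fusion: a sigmoid gate decides how much of the region
    feature versus the grid feature enters the fused representation."""
    gate = 1.0 / (1.0 + np.exp(-float(np.concatenate([region, grid]) @ w)))
    return gate * region + (1.0 - gate) * grid

region = np.array([1.0, 0.0, 2.0])   # hypothetical region feature
grid = np.array([0.0, 1.0, 1.0])     # hypothetical grid feature
w = np.full(6, 0.1)                  # hypothetical gate weights
fused = gated_fuse(region, grid, w)
print(fused.shape)
```

Because the gate lies in (0, 1), each fused element is a convex combination of the corresponding region and grid elements, so neither visual feature can be entirely discarded.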
Article
Chemistry, Multidisciplinary
Chinnem Rama Mohan, Siddavaram Kiran, Vasudeva
Summary: Feature extraction is the process of collecting the detailed information needed from a given source for further analysis. The proposed hybrid wavelet fusion algorithm, which combines the Dual-Tree Complex Wavelet Transform (DTCWT) with the Stationary Wavelet Transform (SWT), overcomes the limitations of traditional wavelet-based fusion algorithms while preserving directional selectivity and shift invariance.
APPLIED SCIENCES-BASEL
(2023)
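Wavelet-based fusion schemes like the one above combine the subband coefficients of the two decomposed source images with a fusion rule. A common choice for detail subbands, shown here as an assumption rather than the paper's exact rule, is max-absolute selection:

```python
import numpy as np

def fuse_details(c_a, c_b):
    """Max-absolute fusion rule: at each position, keep the detail
    coefficient with the larger magnitude (i.e. the stronger edge)."""
    return np.where(np.abs(c_a) >= np.abs(c_b), c_a, c_b)

# Toy detail subbands from two source images at one decomposition level.
a = np.array([[3.0, -1.0], [0.5, 4.0]])
b = np.array([[-2.0, 2.5], [1.0, -3.0]])
fused = fuse_details(a, b)  # selects 3.0, 2.5, 1.0, 4.0
print(fused)
```

Approximation subbands are usually fused differently (e.g. by averaging), since they carry low-frequency intensity rather than edge detail.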
Article
Environmental Sciences
Xiangzeng Liu, Haojie Gao, Qiguang Miao, Yue Xi, Yunfeng Ai, Dingguo Gao
Summary: In this paper, a novel method named multi-modal feature self-adaptive transformer (MFST) is proposed for infrared and visible image fusion. This method extracts multi-modal features from input images using a convolutional neural network, and fuses these features using an adaptive fusion strategy. Experimental results demonstrate that the proposed method outperforms other methods in terms of fusion performance.
Article
Computer Science, Artificial Intelligence
Xuejian Li, Shiqiang Ma, Junhai Xu, Jijun Tang, Shengfeng He, Fei Guo
Summary: Automatic segmentation of medical images is crucial for disease diagnosis. This paper proposes a dual-path segmentation model called TranSiam for multi-modal medical images. The model utilizes parallel CNNs and a Transformer layer to extract features from different modalities, and aggregates the features using a locality-aware aggregation block.
EXPERT SYSTEMS WITH APPLICATIONS
(2024)
Article
Engineering, Biomedical
Shenhai Zheng, Jiaxin Tan, Chuangbo Jiang, Laquan Li
Summary: This study designs, proposes, and validates a deep learning method that extends the Transformer to multi-modal medical image segmentation. A novel automated multi-modal Transformer network called AMTNet is introduced for 3D medical image segmentation, and comprehensive experimental analysis on the Prostate and BraTS2021 datasets demonstrates significant improvements over state-of-the-art segmentation networks. This network broadens Transformer research in multi-modal medical image segmentation.
PHYSICS IN MEDICINE AND BIOLOGY
(2023)
Article
Biology
Xiao Liu, Hongyi Chen, Chong Yao, Rui Xiang, Kun Zhou, Peng Du, Weifan Liu, Jie Liu, Zekuan Yu
Summary: Image fusion techniques are widely used in multi-modal medical imaging tasks. However, most existing methods neglect the textural details and contrast between the tissues in regions of interest, which can distort important tumor information and limit the clinical applicability of the fused images. To address this issue, we propose a multi-modal MRI fusion generative adversarial network (BTMF-GAN) that aims to balance tissue details and structural contrasts in the brain tumor region, which is of key importance in medical applications.
COMPUTERS IN BIOLOGY AND MEDICINE
(2023)
Article
Biochemistry & Molecular Biology
Weidong Xie, Yushan Fang, Guicheng Yang, Kun Yu, Wei Li
Summary: The significance of multi-modal data becomes evident as the number of modalities in biomedical data continues to increase. However, current multi-modal fusion methods for biomedical data lack effective exploitation of intra- and inter-modal interactions, and the application of powerful fusion methods is rare. In this paper, a novel multi-modal data fusion method is proposed, which utilizes a graph neural network and a 3D convolutional network to identify intra-modal relationships, employs the Low-rank Multi-modal Fusion method to fuse information from different modalities, and incorporates the Cross-modal Transformer to learn relationships between modalities.
Article
Computer Science, Artificial Intelligence
Enqiang Wang, Qing Yu, Yelin Chen, Wushouer Slamu, Xukang Luo
Summary: This study proposes a multi-modal knowledge graph representation learning method using multi-head self-attention, which improves the effectiveness of link prediction by adding rich multi-modal information to entities.
INFORMATION FUSION
(2022)
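The multi-head self-attention used above to enrich entity representations can be sketched in a minimal form. This is a simplification, not the paper's model: Q, K, and V use identity projections (real models learn these matrices), and the token embeddings are toy values.

```python
import numpy as np

def multi_head_self_attention(x, n_heads):
    """Minimal multi-head self-attention: split the embedding into heads,
    compute scaled dot-product attention per head with identity Q/K/V
    projections, and concatenate the per-head outputs."""
    n, d = x.shape
    hd = d // n_heads
    outputs = []
    for h in range(n_heads):
        q = k = v = x[:, h * hd:(h + 1) * hd]
        scores = q @ k.T / np.sqrt(hd)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)   # softmax over keys
        outputs.append(w @ v)
    return np.concatenate(outputs, axis=-1)

entity_tokens = np.arange(12.0).reshape(3, 4)  # 3 modality tokens, dim 4
out = multi_head_self_attention(entity_tokens, n_heads=2)
print(out.shape)
```

Each head attends over all modality tokens independently, which lets different heads weight different modalities when forming the entity representation.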
Article
Computer Science, Information Systems
Hui Liu, Shanshan Li, Jicheng Zhu, Kai Deng, Meng Liu, Liqiang Nie
Summary: Multi-modal medical image fusion is an important research topic in the field of medical imaging, which helps doctors diagnose and treat diseases more efficiently by obtaining informative medical images. Most fusion methods, however, subjectively extract and fuse features, leading to distortion of the unique information of source images. This work presents a novel end-to-end unsupervised network that uses a generator and two symmetrical discriminators to fuse multi-modal medical images. The generator produces a realistic fused image based on a specially designed content and structure loss, while the discriminators distinguish the fused image from the source images. The experimental results demonstrate the superiority of this method over cutting-edge baselines, both visually and quantitatively.
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
(2023)
Article
Environmental Sciences
Xinchen Li, Dan Jing, Yachao Li, Liang Guo, Liang Han, Qing Xu, Mengdao Xing, Yihua Hu
Summary: This paper introduces a fusion method for multi-band and polarization synthetic aperture radar (SAR) images, using the non-subsampled shearlet transform (NSST) and extracting the band and polarization difference information for colorization. Experimental results demonstrate that the proposed method can effectively preserve and interpret important information.
Article
Metallurgy & Metallurgical Engineering
Huang Hai-peng, Hao Ben-tian, Ye De-jun, Gao Hao, Li Liang
Summary: In this study, a multi-modal feature fusion network model was constructed based on a laser paint-removal experiment, and an attention mechanism was introduced to optimize detection accuracy. Experimental results showed that the constructed model outperformed single-modal detection, and accuracy was further improved by optimizing the feature-extraction network with the attention mechanism.
JOURNAL OF CENTRAL SOUTH UNIVERSITY
(2022)
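Attention-weighted fusion of modality features, as used in several of the entries above, can be sketched as follows. The scoring function here is a toy stand-in (real models learn the scorer), and the feature vectors are hypothetical.

```python
import numpy as np

def attention_fuse(modal_feats):
    """Attention-weighted fusion sketch: score each modality's feature
    vector (toy mean-based score here), softmax the scores, and return
    the weighted sum together with the attention weights."""
    scores = np.array([f.mean() for f in modal_feats])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    fused = sum(wi * f for wi, f in zip(w, modal_feats))
    return fused, w

acoustic = np.array([0.2, 0.8, 0.4])   # hypothetical modality features
visual = np.array([0.6, 0.1, 0.9])
fused, weights = attention_fuse([acoustic, visual])
print(fused.shape)
```

The softmax guarantees the modality weights are positive and sum to one, so the fused vector stays on the same scale as the inputs while emphasizing the more informative modality.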