4.6 Article

Exploring Deep Learning for View-Based 3D Model Retrieval

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3377876

Keywords

3D model retrieval; benchmark; deep learning features; handcrafted feature

Funding

  1. National Natural Science Foundation of China [61872270, 61572357]
  2. National Key R&D Program of China [2019YFBB1404700]
  3. Jinan's innovation team [2018GXRC014]

In recent years, view-based 3D model retrieval has become a research focus in computer vision and machine learning. A 3D model retrieval algorithm consists of feature extraction and similarity measurement, and robust features play a decisive role in the similarity measurement. Although deep learning has achieved broad success in computer vision, only a small number of works use deep learning features for 3D model retrieval, and to the best of our knowledge there is no benchmark for evaluating these features. To address this gap, we systematically evaluate the performance of deep learning features in view-based 3D model retrieval on four popular datasets (ETH, NTU60, PSB, and MVRED) using different similarity measures. Specifically, we compare hand-crafted features with deep learning features, assess the robustness of deep learning features, and evaluate the difference between single-view and multi-view deep learning features. Quantitative analysis across the datasets shows that deep learning features consistently outperform all of the hand-crafted features and are also more robust when different degrees of noise are added to the images. Exploring the latent relationships among different views in multi-view deep learning network architectures shows that multi-view deep learning features outperform single-view deep learning features with low computational complexity.
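
The two stages named in the abstract, feature extraction and similarity measurement, can be illustrated with a minimal sketch. It is not the benchmark's actual pipeline: the per-view features are random stand-ins for CNN outputs, element-wise max pooling across views is an assumed aggregation scheme, cosine similarity is just one possible similarity measure, and the function names `pool_views`, `l2_normalize`, and `rank_gallery` are hypothetical.

```python
import numpy as np


def pool_views(view_features: np.ndarray) -> np.ndarray:
    """Collapse a (num_views, dim) array into one (dim,) descriptor.

    Element-wise max over views is an assumed aggregation choice.
    """
    return view_features.max(axis=0)


def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def rank_gallery(query: np.ndarray, gallery: np.ndarray) -> np.ndarray:
    """Return gallery indices ordered from most to least similar to the query."""
    sims = l2_normalize(gallery) @ l2_normalize(query)  # cosine similarity
    return np.argsort(-sims, kind="stable")


# Synthetic stand-in for per-view deep features (no real CNN or dataset here).
rng = np.random.default_rng(0)
num_models, num_views, dim = 5, 12, 64
gallery_views = rng.normal(size=(num_models, num_views, dim))
gallery = np.stack([pool_views(v) for v in gallery_views])

# Query: the views of model 3, perturbed by a small amount of noise.
query_views = gallery_views[3] + 0.01 * rng.normal(size=(num_views, dim))
query = pool_views(query_views)

ranking = rank_gallery(query, gallery)
print("ranked gallery indices:", ranking.tolist())
```

Replacing `pool_views` with a single view's feature vector gives a single-view baseline under the same similarity measure, which is the kind of comparison the abstract describes.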

Recommended

Article Computer Science, Artificial Intelligence

Segmentation of ultrasound image sequences by combing a novel deep siamese network with a deformable contour model

Bo Ni, Zhiyuan Liu, Xiantao Cai, Michele Nappi, Shaohua Wan

Summary: This paper proposes a novel deformable contour model for segmenting ultrasound image sequences. The model uses the ability of deep learning networks to learn image features to overcome the challenges in ultrasound image segmentation. Experimental results show that the proposed method outperforms state-of-the-art methods on clinical ultrasound images.

NEURAL COMPUTING & APPLICATIONS (2023)

Article Computer Science, Information Systems

Local Correlation Ensemble with GCN Based on Attention Features for Cross-domain Person Re-ID

Yue Zhang, Fanghui Zhang, Yi Jin, Yigang Cen, Viacheslav Voronin, Shaohua Wan

Summary: In this paper, a novel local correlation ensemble model is proposed to address the cross-domain problem in the Re-ID task. The model improves the utilization of unlabeled samples in the target domain by focusing on person's features and calculating the distance between nodes. Experimental results on large-scale public Re-ID datasets demonstrate the effectiveness of the proposed method.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Artificial Intelligence

Joint Intent Detection Model for Task-oriented Human-Computer Dialogue System using Asynchronous Training

Yirui Wu, Hao Li, Lilai Zhang, Chen Dong, Qian Huang, Shaohua Wan

Summary: Accurately understanding low-resource languages is crucial for task-oriented human-computer dialogue systems. This involves intent detection and slot filling, which face challenges due to semantic ambiguity and implicit intentions. To address these issues, a joint intent detection method using an asynchronous training strategy is proposed, which encodes local text information and emphasizes relationships among words. By fusing hidden states or fine-tuning the network with key information, the relevance between intent detection and slot filling is greatly improved. Validation on airline travel (ATIS) and electricity service (ECSF) datasets achieves 97.49% and 89.68% accuracy, respectively, confirming the effectiveness of joint learning and asynchronous training.

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING (2023)

Article Engineering, Civil

Edge Intelligence Empowered Vehicle Detection and Image Segmentation for Autonomous Vehicles

Chen Chen, Chenyu Wang, Bin Liu, Ci He, Li Cong, Shaohua Wan

Summary: Edge intelligence technology combined with computer vision can improve traffic information processing and enhance vehicle detection ability. Additionally, using an improved image segmentation algorithm helps reduce network size while improving segmentation accuracy.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2023)

Article Computer Science, Information Systems

DA-Net: Dual Attention Network for Flood Forecasting

Qian Cheng, Yirui Wu, Aniello Castiglione, Fabio Narducci, Shaohua Wan

Summary: This paper introduces a deep learning-based flood prediction method, presenting a dual attention embedding network (DA-Net). The proposed method utilizes a convolution self-attention module (CSA) and a Temporal-related Feature Attention (TFA) module to capture both local and global flood features, achieving accurate prediction results.

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY (2023)

Article Computer Science, Artificial Intelligence

GDRL: An interpretable framework for thoracic pathologic prediction

Yirui Wu, Hao Li, Xi Feng, Andrea Casanova, Andrea F. Abate, Shaohua Wan

Summary: Deep learning methods have shown significant performance in medical image analysis tasks, but lack interpretability in feature extraction and decision processes. To address this, a novel Group-Disentangled Representation Learning framework (GDRL) is proposed, which disentangles latent space into disease concepts with abundant and non-overlapping feature explanations. By emphasizing the link between semantic concepts of disease and low-level visual features, GDRL enhances interpretability and shows potential in predicting diseases from chest X-ray images.

PATTERN RECOGNITION LETTERS (2023)

Article Chemistry, Analytical

LoRa-Based IoT Network Assessment in Rural and Urban Scenarios

Aikaterini I. Griva, Achilles D. Boursianis, Shaohua Wan, Panagiotis Sarigiannidis, Konstantinos E. Psannis, George Karagiannidis, Sotirios K. Goudos

Summary: The implementation of smart networks has been greatly advanced by the development of IoT, with LoRa being a prominent technology due to its long-distance transmission capabilities with low power consumption. This study simulated various environments to assess network performance based on different factors and parameters. Path loss model, deployment area size, transmission power, spreading factor, number of nodes and gateways, and antenna gain significantly affect the energy consumption and data extraction rate of LoRa networks. The research performed simulations using the FLoRa framework in OMNeT++, investigating rural and urban environments, as well as a parking area model. The results emphasize the importance of optimizing key parameters for the deployment of smart networks.

SENSORS (2023)

Article Computer Science, Hardware & Architecture

Edge-AI-Driven Framework with Efficient Mobile Network Design for Facial Expression Recognition

Yirui Wu, Lilai Zhang, Zonghua Gu, Hu Lu, Shaohua Wan

Summary: This article proposes an Edge-AI-driven framework for Facial Expression Recognition (FER) in the wild, addressing challenges such as occlusions, illumination, scale, and head pose variations. It introduces two attention modules, Arbitrary-oriented Spatial Pooling (ASP) and Scalable Frequency Pooling (SFP), for effective feature extraction to improve classification accuracy. The article also presents an edge-cloud joint inference architecture for FER, achieving low-latency inference with a lightweight backbone network on the edge device and optional attention modules partially offloaded to the cloud. Performance evaluation shows a good balance between classification accuracy and inference latency in this approach.

ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Searching sharing relationship for instance segmentation decoder

Yuling Xi, Ning Wang, Shaohua Wan, Xiaoming Wang, Peng Wang, Yanning Zhang

Summary: Instance segmentation is a visual task that requires predicting per-pixel masks and category labels for each instance. We propose using Neural Architecture Search (NAS) to automatically search for a hardware and memory-friendly feature sharing branch, and our method can be applied to similar multi-task networks. Experimental results show that our method exceeds classical parallel decoder networks in terms of bounding box mAP and segmentation mAP.

APPLIED INTELLIGENCE (2023)

Article Computer Science, Information Systems

Generating live commentary for marine traffic scenarios based on multi-model learning

Rui Zhang, Xiaojie Li, Yifan Zhuo, Kezhong Liu, Xian Zhong, Shaohua Wan

Summary: The Internet of Things (IoT) plays a crucial role in maritime transportation by enabling the construction of marine traffic scenarios, improving efficiency and safety. To address challenges in marine traffic monitoring, a multi-model learning approach is proposed, along with an innovative dataset and a text generation model based on a multi-modal Transformer architecture. Experimental results show that our approach effectively generates accurate and informative descriptions of maritime activity.

COMPUTER COMMUNICATIONS (2023)

Article Computer Science, Information Systems

Two Path Gland Segmentation Algorithm of Colon Pathological Image Based on Local Semantic Guidance

Songtao Ding, Hongyu Wang, Hu Lu, Michele Nappi, Shaohua Wan

Summary: In this paper, a two-path gland segmentation algorithm for colon pathological images based on local semantic guidance is proposed. An improved candidate region search algorithm is employed to generate sub-datasets sensitive to specific features. A semantic feature-guided model is used to extract local adenocarcinoma features and enhance the network's ability to learn gland morphological features.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS (2023)

Editorial Material Computer Science, Artificial Intelligence

Editorial: Ontology-based Knowledge Presentation and Computational Linguistics for Semantic Big Social Data Analytics in Asian Social Networks

Chinmay Chakraborty, Shaohua Wan, Mohammad R. Khosravi

Summary: Data-driven ontology-based knowledge (OK) presentation and computational linguistics for evolving semantic Asian social networks (ASNs) can provide a robust and real-time data mapping platform, named OK-ASN, that allows massive access across heterogeneous big data sources on the web. It utilizes computational intelligence, web-of-things (WoT) architecture, semantic features, statistical learning and pattern recognition, database management, computer vision, cyber-security, and language processing. OK-ASN is a critical strategy for mining WoT big data and promoting enterprises in various sectors from social media to medical and industrial fields.

ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING (2023)

Article Automation & Control Systems

DLS-GAN: Generative Adversarial Nets for Defect Location Sensitive Data Augmentation

Wei Li, Chengchun Gu, Jinlin Chen, Chao Ma, Xiaowu Zhang, Bin Chen, Shaohua Wan

Summary: This paper proposes a data augmentation model called "DLS-GAN" to address the problem of defect location sensitive data augmentation. The model modifies the generator and introduces discriminators, and the experimental results show that DLS-GAN can synthesize high-quality images with desired defects better than state-of-the-art generative models on different types of DLS datasets.

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING (2023)

Article Computer Science, Hardware & Architecture

Decision Boundary-Aware Data Augmentation for Adversarial Training

Chen Chen, Jingfeng Zhang, Xilie Xu, Lingjuan Lyu, Chaochao Chen, Tianlei Hu, Gang Chen

Summary: Adversarial training (AT) is a method to improve the robustness of deep neural networks by training on adversarial variants generated from natural examples. However, as training progresses, the training data becomes less attackable, undermining the enhancement of model robustness. To address this issue, this paper proposes a Decision boundary-aware data Augmentation framework (CODA) that utilizes meta information from previous epochs to guide the augmentation process and generate attackable data close to the decision boundary. CODA outperforms vanilla mixup by providing a higher ratio of attackable data, enhancing model robustness while mitigating the linear behavior between classes that is unfavorable for adversarial training. Experimental results demonstrate that CODA improves adversarial robustness across various training methods and datasets.

IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING (2023)

Article Computer Science, Information Systems

Joint QoS Aware Admission Control and Power Allocation in NOMA Downlink Networks

Sotirios K. Goudos, Panagiotis D. Diamantoulakis, Achilles D. Boursianis, Panagiotis Sarigiannidis, Konstantinos E. Psannis, Mohammad Abdul Matin, Shaohua Wan, George K. Karagiannidis

Summary: In this work, we address the problem of joint power allocation and user association for non-orthogonal multiple access (NOMA) in downlink networks based on quality-of-service. Due to its non-convex form and the large number of optimization variables, the problem is challenging and we propose two nature-inspired algorithms with low complexity for solving it. We investigate the impact of different network parameters on increasing users and show that evolutionary algorithms are effective in solving this problem, outperforming randomly generated solutions. Furthermore, the advantages of NOMA over OMA become more evident as the number of users increases.

IEEE ACCESS (2023)