4.7 Article

Cross-domain object detection using unsupervised image translation

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 192

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.116334

Keywords

Unsupervised Domain Adaptation; Object detection; Generative Adversarial Networks; Unpaired image-to-image translation; Style-transfer

Funding

  1. Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior (CAPES, Brazil) [001]
  2. Conselho Nacional de Desenvolvimento Cientifico e Tecnologico (CNPq, Brazil) [311654/2019-3, 310330/2020-3, 200864/2019-0]
  3. Fundacao de Amparo a Pesquisa do Espirito Santo (FAPES, Brazil) [256/2021, 84412844]

Unsupervised domain adaptation for object detection focuses on transferring detectors from a source domain to an unseen target domain, with recent methods showing promise by aligning intermediate features. This study proposes a method to generate artificial data in the target domain using unsupervised image translators, leading to significant improvements in real-world scenarios for autonomous driving and outperforming existing methods in most cases. The proposed approach offers a less complex yet more effective solution with improved interpretability.
Unsupervised domain adaptation for object detection addresses the adaptation of detectors trained in a source domain to work accurately in an unseen target domain. Recently, methods that align intermediate features have proven promising, achieving state-of-the-art results. However, these methods are laborious to implement and hard to interpret, and there is still room for improvement in closing the performance gap toward the upper bound (training with the target data). In this work, we propose a method to generate an artificial dataset in the target domain to train an object detector. We employ two unsupervised image translators (CycleGAN and an AdaIN-based model), using only annotated data from the source domain and non-annotated data from the target domain. Our key contribution is a less complex yet more effective method that also offers improved interpretability. Results on real-world scenarios for autonomous driving show significant improvements, outperforming state-of-the-art methods in most cases and further closing the gap toward the upper bound.
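The data-generation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that translated images keep the source annotations unchanged, and it uses a toy stand-in for the translator that applies the AdaIN operation (matching per-channel mean and standard deviation) directly to raw values rather than to learned feature maps. The names `adain_channel`, `translate_dataset`, and the sample layout are hypothetical.

```python
from statistics import mean, pstdev


def adain_channel(content, style, eps=1e-5):
    """Re-normalize one channel of `content` to the mean/std of `style`.

    Toy illustration of the AdaIN statistic-matching step; the real
    translators operate on learned features, not raw pixel lists.
    """
    c_mean, c_std = mean(content), pstdev(content)
    s_mean, s_std = mean(style), pstdev(style)
    return [(x - c_mean) / (c_std + eps) * s_std + s_mean for x in content]


def translate_dataset(source_samples, style_channel):
    """Build a target-like dataset from annotated source samples.

    Assumption: translation changes appearance only, so the source
    bounding boxes are reused as labels for the generated images.
    """
    return [
        {"pixels": adain_channel(s["pixels"], style_channel), "boxes": s["boxes"]}
        for s in source_samples
    ]


# Hypothetical data: one annotated source image, one unannotated target image.
source = [{"pixels": [0.0, 1.0, 2.0, 3.0], "boxes": [(0, 0, 2, 2)]}]
target_style = [10.0, 12.0, 14.0, 16.0]

fake_target = translate_dataset(source, target_style)
```

A detector would then be trained on `fake_target` exactly as on any annotated dataset, which is where the claimed simplicity comes from: no changes to the detector's architecture or loss are needed in this sketch.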

Recommended

Article Computer Science, Information Systems

Deep Unsupervised Key Frame Extraction for Efficient Video Classification

Hao Tang, Lei Ding, Songsong Wu, Bin Ren, Nicu Sebe, Paolo Rota

Summary: This paper proposes an unsupervised method for key frame extraction that combines a convolutional neural network with temporal segment density peaks clustering. The proposed method addresses the imbalance between performance and efficiency in large-scale video classification. Experimental results demonstrate that the proposed strategy achieves competitive performance and efficiency compared with state-of-the-art approaches.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

Bidirectional Transformer GAN for Long-term Human Motion Prediction

Mengyi Zhao, Hao Tang, Pan Xie, Shuling Dai, Nicu Sebe, Wei Wang

Summary: Conventional motion prediction methods tend to focus on short-term prediction, leading to the freezing forecasting problem, in which predicted long-term motions collapse to average poses. To address this, we propose a novel Bidirectional Transformer-based Generative Adversarial Network (BiTGAN) for long-term human motion prediction. By utilizing both forward and backward directions, our bidirectional setup ensures consistent and smooth generation. To make full use of the motion history, we split it into two parts and use them as the encoder and decoder inputs, respectively. Additionally, we introduce the soft dynamic time warping (Soft-DTW) loss to better maintain local and global similarities, and employ a dual discriminator to distinguish predicted sequences. Experimental results on the Human3.6M dataset show that BiTGAN achieves state-of-the-art performance, with a 4% average error reduction across all actions.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2023)

Article Computer Science, Artificial Intelligence

Self-training transformer for source-free domain adaptation

Guanglei Yang, Zhun Zhong, Mingli Ding, Nicu Sebe, Elisa Ricci

Summary: In this paper, the authors study source-free domain adaptation and propose the TransDA framework based on Transformer to address the issue of unavailable source data. The framework utilizes attention modules and self-supervised knowledge distillation to improve the model's generalization ability and accuracy.

APPLIED INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Mitigating robust overfitting via self-residual-calibration regularization

Hong Liu, Zhun Zhong, Nicu Sebe, Shin'ichi Satoh

Summary: The issue of overfitting in adversarial training has gained attention in the AI and machine learning community. This paper evaluates the performance of several calibration methods on robust models and proposes a regularization method called Self-Residual-Calibration (SRC), which effectively mitigates overfitting while improving robustness. The results show that SRC is complementary to other regularization methods and achieves top performance on the benchmark leaderboard.

ARTIFICIAL INTELLIGENCE (2023)

Article Computer Science, Interdisciplinary Applications

Lower-limb kinematic reconstruction during pedaling tasks from EEG signals using Unscented Kalman filter

Cristian Felipe Blanco-Diaz, Cristian David Guerrero-Mendez, Denis Delisle-Rodriguez, Alberto Ferreira de Souza, Claudine Badue, Teodiano Freire Bastos-Filho

Summary: This study proposes a nonlinear neural decoder using an Unscented Kalman Filter (UKF) to infer lower-limb kinematics from EEG signals during pedaling. The results demonstrated maximum decoding accuracy using slow cortical potentials in the delta band (0.1-4 Hz) of 0.33 for Pearson's r-value and 8 for the signal-to-noise ratio (SNR). This opens the door to the development of closed-loop EEG-based BCI systems for kinematic monitoring during pedaling rehabilitation tasks.

COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING (2023)

Article Computer Science, Artificial Intelligence

Fast Differentiable Matrix Square Root and Inverse Square Root

Yue Song, Nicu Sebe, Wei Wang

Summary: This paper proposes two more efficient variants, Matrix Taylor Polynomial (MTP) and Matrix Pade Approximants (MPA), to compute the differentiable matrix square root and inverse square root. Numerical tests and real-world applications demonstrate the superior performance and competitive speed of both methods.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Radiology, Nuclear Medicine & Medical Imaging

EEG motor imagery classification using deep learning approaches in naive BCI users

Cristian D. Guerrero-Mendez, Cristian F. Blanco-Diaz, Andres F. Ruiz-Olaya, Alberto Lopez-Delis, Sebastian Jaramillo-Isaza, Rafhael Milanezi Andrade, Alberto Ferreira De Souza, Denis Delisle-Rodriguez, Anselmo Frizera-Neto, Teodiano F. Bastos-Filho

Summary: This study compares the performance of naive BCI users using three different deep learning (DL) methods. The results show that the LSTM-BiLSTM-based approach performs the best, with a 32% improvement compared to baseline methods. It is expected that this study will increase the controllability, usability, and reliability of robotic devices for naive BCI users.

BIOMEDICAL PHYSICS & ENGINEERING EXPRESS (2023)

Article Computer Science, Artificial Intelligence

Interactive Neural Painting

Elia Peruzzo, Willi Menapace, Vidit Goel, Federica Arrigoni, Hao Tang, Xingqian Xu, Arman Chopikyan, Nikita Orlov, Yuxiao Hu, Humphrey Shi, Nicu Sebe, Elisa Ricci

Summary: This paper proposes a novel approach for Interactive Neural Painting, which assists users in creating realistic artworks by suggesting next strokes. The proposed I-Paint method, based on a conditional transformer Variational AutoEncoder (VAE) architecture, shows promising results compared to existing techniques. Additionally, two new datasets are introduced to evaluate and foster further research in this area.

COMPUTER VISION AND IMAGE UNDERSTANDING (2023)

Article Engineering, Civil

100-Driver: A Large-Scale, Diverse Dataset for Distracted Driver Classification

Jing Wang, Wenjing Li, Fang Li, Jun Zhang, Zhongcheng Wu, Zhun Zhong, Nicu Sebe

Summary: This paper introduces a large-scale and diverse posture-based distracted driver dataset, which includes over 470K images captured by 4 cameras observing 100 drivers from 5 vehicles over 79 hours. The researchers provide a detailed data analysis and present 4 different settings to investigate practical problems of distracted driving: a traditional setting and 3 challenging settings with domain shifts. Comprehensive experiments demonstrate the importance of this dataset, which offers new opportunities for further development of distracted driving research.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

A Memorizing and Generalizing Framework for Lifelong Person Re-Identification

Nan Pu, Zhun Zhong, Nicu Sebe, Michael S. Lew

Summary: This paper introduces the lifelong person re-identification (LReID) task and proposes a new MEmorizing and GEneralizing (MEGE) framework to prevent forgetting and improve generalization ability. The framework consists of Adaptive Knowledge Accumulation (AKA) and differentiable Ranking Consistency Distillation (RCD) modules. Experimental results demonstrate that the MEGE framework significantly improves performance on both seen and unseen domains.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Information Systems

ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation

Yahui Liu, Yajing Chen, Linchao Bao, Nicu Sebe, Bruno Lepri, Marco De Nadai

Summary: Recently, there has been a growing interest in utilizing pre-trained unconditional image generators for image editing, but translating images to multiple visual domains using these methods is still challenging. Existing approaches often fail to preserve the domain-invariant part of the image or handle multiple domains and multi-modal translations. This work proposes an implicit style function (ISF) that enables straightforward multi-modal and multi-domain image-to-image translation using pre-trained unconditional generators. The experiments demonstrate significant improvements over the baselines in manipulating human faces and animal images. The model allows for cost-effective, high-resolution, multi-modal unsupervised image-to-image translations using pre-trained unconditional GANs. The code and data are available at: https://github.com/yhlleo/stylegan-mmuit.

IEEE TRANSACTIONS ON MULTIMEDIA (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Overlap-guided Gaussian Mixture Models for Point Cloud Registration

Guofeng Mei, Fabio Poiesi, Cristiano Saltori, Jian Zhang, Elisa Ricci, Nicu Sebe

Summary: This paper proposes a novel overlap-guided probabilistic registration approach that computes the optimal transformation from matched Gaussian Mixture Model (GMM) parameters. The method introduces a Transformer-based detection module to detect overlapping regions and uses GMMs to represent the input point clouds. Experimental results demonstrate that the method outperforms state-of-the-art methods in terms of registration accuracy and efficiency when dealing with point clouds with partial overlap and different densities.

2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) (2023)

Article Computer Science, Artificial Intelligence

Orthogonal SVD Covariance Conditioning and Latent Disentanglement

Yue Song, Nicu Sebe, Wei Wang

Summary: This article investigates how to improve the covariance conditioning by enforcing orthogonality to the Pre-SVD layer, and proposes the Nearest Orthogonal Gradient (NOG) and Optimal Learning Rate (OLR) as methods. Experimental results demonstrate that these methods can simultaneously improve covariance conditioning and generalization, and combining them with orthogonal weight can further boost performance. Additionally, a series of experiments show the benefits of orthogonality techniques for better latent disentanglement in generative models.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Theory & Methods

Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration

Juanjuan Weng, Zhiming Luo, Shaozi Li, Nicu Sebe, Zhun Zhong

Summary: This paper investigates the transferability of targeted adversarial examples and finds that the traditional Cross-Entropy (CE) loss function is insufficient. To address this issue, two simple and effective logit calibration methods are proposed to increase transferability.

IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY (2023)

Review Computer Science, Artificial Intelligence

A comprehensive review of slope stability analysis based on artificial intelligence methods

Wei Gao, Shuangshuang Ge

Summary: This study provides a comprehensive review of slope stability research based on artificial intelligence methods, focusing on slope stability computation and evaluation. The review covers studies using quasi-physical intelligence methods, simulated evolutionary methods, swarm intelligence methods, hybrid intelligence methods, artificial neural network methods, vector machine methods, and other intelligence methods. The merits, demerits, and state-of-the-art research advancement of these studies are analyzed, and possible research directions for slope stability investigation based on artificial intelligence methods are suggested.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Machine learning approaches for lateral strength estimation in squat shear walls: A comparative study and practical implications

Khuong Le Nguyen, Hoa Thi Trinh, Saeed Banihashemi, Thong M. Pham

Summary: This study investigated the influence of input parameters on the shear strength of RC squat walls and found that ensemble learning models, particularly XGBoost, can effectively predict the shear strength. The axial load had a greater influence than the reinforcement ratio, and longitudinal reinforcement had a more significant impact than horizontal and vertical reinforcement. The XGBoost model outperforms traditional design models, and reducing the input features still yields reliable predictions.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

DHESN: A deep hierarchical echo state network approach for algal bloom prediction

Bo Hu, Huiyan Zhang, Xiaoyi Wang, Li Wang, Jiping Xu, Qian Sun, Zhiyao Zhao, Lei Zhang

Summary: A deep hierarchical echo state network (DHESN) is proposed to address the limitations of shallow coupled structures. By using transfer entropy, candidate variables with strong causal relationships are selected and a hierarchical reservoir structure is established to improve prediction accuracy. Simulation results demonstrate that DHESN performs well in predicting algal bloom.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Learning high-dependence Bayesian network classifier with robust topology

Limin Wang, Lingling Li, Qilong Li, Kuo Li

Summary: This paper discusses the urgency of learning complex multivariate probability distributions due to the increase in data variability and quantity. It introduces a highly scalable classifier called TAN, which utilizes maximum weighted spanning tree (MWST) for graphical modeling. The paper theoretically proves the feasibility of extending one-dependence MWST to model high-dependence relationships and proposes a heuristic search strategy to improve the fitness of the extended topology to data. Experimental results demonstrate that this algorithm achieves a good bias-variance tradeoff and competitive classification performance compared to other high-dependence or ensemble learning algorithms.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Make a song curative: A spatio-temporal therapeutic music transfer model for anxiety reduction

Zhejing Hu, Gong Chen, Yan Liu, Xiao Ma, Nianhong Guan, Xiaoying Wang

Summary: Anxiety is a prevalent issue and music therapy has been found effective in reducing anxiety. To meet the diverse needs of individuals, a novel model called the spatio-temporal therapeutic music transfer model (StTMTM) is proposed.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

A modified reverse-based analysis logic mining model with Weighted Random 2 Satisfiability logic in Discrete Hopfield Neural Network and multi-objective training of Modified Niched Genetic Algorithm

Nur Ezlin Zamri, Mohd. Asyraf Mansor, Mohd Shareduwan Mohd Kasihmuddin, Siti Syatirah Sidik, Alyaa Alway, Nurul Atiqah Romli, Yueling Guo, Siti Zulaikha Mohd Jamaludin

Summary: In this study, a hybrid logic mining model was proposed by combining the logic mining approach with the Modified Niche Genetic Algorithm. This model improves the generalizability and storage capacity of the retrieved induced logic. Various modifications were made to address other issues. Experimental results demonstrate that the proposed model outperforms baseline methods in terms of accuracy, precision, specificity, and correlation coefficient.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

On taking advantage of opportunistic meta-knowledge to reduce configuration spaces for automated machine learning

David Jacob Kedziora, Tien-Dung Nguyen, Katarzyna Musial, Bogdan Gabrys

Summary: The paper addresses the problem of efficiently optimizing machine learning solutions by reducing the configuration space of ML pipelines and leveraging historical performance. The experiments conducted show that opportunistic/systematic meta-knowledge can improve ML outcomes, and configuration-space culling is optimal when balanced. The utility and impact of meta-knowledge depend on various factors and are crucial for generating informative meta-knowledge bases.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Optimal location for an EVPL and capacitors in grid for voltage profile and power loss: FHO-SNN approach

G. Sophia Jasmine, Rajasekaran Stanislaus, N. Manoj Kumar, Thangamuthu Logeswaran

Summary: In the context of a rapidly expanding electric vehicle market, this research investigates the ideal locations for EV charging stations and capacitors in power grids to enhance voltage stability and reduce power losses. A hybrid approach combining the Fire Hawk Optimizer and Spiking Neural Network is proposed, which shows promising results in improving system performance. The optimization approach has the potential to enhance the stability and efficiency of electric grids.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

NLP-based approach for automated safety requirements information retrieval from project documents

Zhijiang Wu, Guofeng Ma

Summary: This study proposes a natural language processing-based framework for requirement retrieval and document association, which can help to mine and retrieve documents related to project managers' requirements. The framework analyzes the ontology relevance and emotional preference of requirements. The results show that the framework performs well in terms of iterations and threshold, and there is a significant matching between the retrieved documents and the requirements, which has significant managerial implications for construction safety management.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Dog nose-print recognition based on the shape and spatial features of scales

Yung-Kuan Chan, Chuen-Horng Lin, Yuan-Rong Ben, Ching-Lin Wang, Shu-Chun Yang, Meng-Hsiun Tsai, Shyr-Shen Yu

Summary: This study proposes a novel method for dog identification using nose-print recognition, which can be applied to controlling stray dogs, locating lost pets, and pet insurance verification. The method achieves high recognition accuracy through two-stage segmentation and feature extraction using a genetic algorithm.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Fostering supply chain resilience for omni-channel retailers: A two-phase approach for supplier selection and demand allocation under disruption risks

Shaohua Song, Elena Tappia, Guang Song, Xianliang Shi, T. C. E. Cheng

Summary: This study aims to optimize supplier selection and demand allocation decisions for omni-channel retailers in order to achieve supply chain resilience. It proposes a two-phase approach that takes into account various factors such as supplier evaluation and demand allocation.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Accelerating Benders decomposition approach for shared parking spaces allocation considering parking unpunctuality and no-shows

Jinyan Hu, Yanping Jiang

Summary: This paper examines the allocation problem of shared parking spaces considering parking unpunctuality and no-shows. It proposes an effective approach using sample average approximation (SAA) combined with an accelerating Benders decomposition (ABD) algorithm to solve the problem. The numerical experiments demonstrate the significance of supply-demand balance for the operation and user satisfaction of the shared parking system.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Review Computer Science, Artificial Intelligence

Financial fraud detection using graph neural networks: A systematic review

Soroor Motie, Bijan Raahemi

Summary: Financial fraud is a persistent problem in the finance industry, but Graph Neural Networks (GNNs) have emerged as a powerful tool for detecting fraudulent activities. This systematic review provides a comprehensive overview of the current state-of-the-art technologies in using GNNs for financial fraud detection, identifies gaps and limitations in existing research, and suggests potential directions for future research.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Review Computer Science, Artificial Intelligence

Occluded person re-identification with deep learning: A survey and perspectives

Enhao Ning, Changshuo Wang, Huang Zhang, Xin Ning, Prayag Tiwari

Summary: This review provides a detailed overview of occluded person re-identification methods and conducts a systematic analysis and comparison of existing deep learning-based approaches. It offers important theoretical and practical references for future research in the field.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

A hierarchical attention detector for bearing surface defect detection

Jiajun Ma, Songyu Hu, Jianzhong Fu, Gui Chen

Summary: The article presents a novel visual hierarchical attention detector for multi-scale defect location and classification, utilizing texture, semantic, and instance features of defects through a hierarchical attention mechanism, achieving multi-scale defect detection in bearing images with complex backgrounds.

EXPERT SYSTEMS WITH APPLICATIONS (2024)