4.7 Article

SuperstarGAN: Generative adversarial networks for image-to-image translation in large-scale domains

Journal

NEURAL NETWORKS
Volume 162, Issue -, Pages 330-339

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2023.02.042

Keywords

Generative adversarial networks; Image-to-image translation; Domain translation; Face image translation; Image generation

Image-to-image translation with GANs is an active research area. StarGAN stands out by achieving multiple-domain translation with a single generator, but it has limitations in learning large-scale domain mappings and expressing small feature changes. To overcome these limitations, an improved version called SuperstarGAN is proposed. It incorporates the idea of Controllable GAN and uses data augmentation techniques to handle overfitting. Evaluated on a face image dataset, SuperstarGAN achieves better performance in terms of FID and LPIPS compared to StarGAN, and it can also control the degree of expression of target domain features in generated images.
Image-to-image translation with generative adversarial networks (GANs) has been extensively studied in recent years. Among the models, StarGAN has achieved image-to-image translation for multiple domains with a single generator, whereas conventional models require multiple generators. However, StarGAN has several limitations, including the lack of capacity to learn mappings among large-scale domains; furthermore, StarGAN can barely express small feature changes. To address these limitations, we propose an improved StarGAN, namely SuperstarGAN. We adopted the idea, first proposed in controllable GAN (ControlGAN), of training an independent classifier with data augmentation techniques to handle the overfitting problem in the classification of StarGAN structures. Since the generator with a well-trained classifier can express small features belonging to the target domain, SuperstarGAN achieves image-to-image translation in large-scale domains. Evaluated with a face image dataset, SuperstarGAN demonstrated improved performance in terms of Fréchet Inception distance (FID) and learned perceptual image patch similarity (LPIPS). Specifically, compared to StarGAN, SuperstarGAN exhibited decreased FID and LPIPS by 18.1% and 42.5%, respectively. Furthermore, we conducted an additional experiment with interpolated and extrapolated label values, indicating the ability of SuperstarGAN to control the degree of expression of the target domain features in generated images. Additionally, SuperstarGAN was successfully adapted to an animal face dataset and a painting dataset, where it can translate styles of animal faces (e.g., a cat to a tiger) and styles of painters (e.g., Hassam to Picasso), respectively, which demonstrates the generality of SuperstarGAN regardless of dataset. © 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
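
As a rough illustration of the interpolated and extrapolated label values mentioned in the abstract, the sketch below blends a source label vector toward, and beyond, a target label vector. The function name `blend_labels`, the parameter `alpha`, and the 0/1 attribute encoding are assumptions made for this example; the abstract does not state how labels are encoded or passed to the generator.

```python
from typing import List


def blend_labels(source: List[float], target: List[float], alpha: float) -> List[float]:
    """Linearly blend two domain-label vectors (hypothetical helper).

    alpha = 0 returns the source labels and alpha = 1 the target labels.
    Values strictly between 0 and 1 interpolate; values outside that
    range extrapolate past the source or target labels.
    """
    if len(source) != len(target):
        raise ValueError("label vectors must have the same length")
    return [s + alpha * (t - s) for s, t in zip(source, target)]


if __name__ == "__main__":
    # Hypothetical three-attribute labels; only the first attribute changes.
    source = [0.0, 1.0, 0.0]
    target = [1.0, 1.0, 0.0]
    for alpha in (0.0, 0.5, 1.0, 1.5):
        print(alpha, blend_labels(source, target, alpha))
```

In a conditional generator, each blended vector would stand in for the usual one-hot or binary target label, so that larger `alpha` values request a stronger expression of the target attribute.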

Recommended

Review Biochemistry & Molecular Biology

Deep Learning Approaches for lncRNA-Mediated Mechanisms: A Comprehensive Review of Recent Developments

Yoojoong Kim, Minhyeok Lee

Summary: This review paper analyzes the convergence of deep learning and long non-coding RNAs (lncRNAs), two rapidly evolving research areas. It aims to provide a comprehensive examination of these intertwined areas, given the recent advancements in deep learning and the increasing recognition of lncRNAs' importance. By scrutinizing research from 2021 to 2023, this paper offers insights into how deep learning techniques are employed to investigate lncRNAs.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2023)

Article Mathematics

A Mathematical Investigation of Hallucination and Creativity in GPT Models

Minhyeok Lee

Summary: In this paper, a comprehensive mathematical analysis of the hallucination phenomenon in GPT models is presented. Hallucination and creativity are rigorously defined and measured using concepts from probability theory and information theory. The trade-off between hallucination and creativity is characterized by introducing a parametric family of GPT models, and an optimal balance that maximizes model performance across various tasks is identified. This work offers a novel mathematical framework for understanding the origins and implications of hallucination in GPT models and facilitates future research and development in the field of large language models (LLMs).

MATHEMATICS (2023)

Review Mathematics

The Geometry of Feature Space in Deep Learning Models: A Holistic Perspective and Comprehensive Review

Minhyeok Lee

Summary: This review provides a comprehensive and holistic outlook on the geometry of feature spaces in deep learning models, exploring the interconnections between feature spaces and influential factors such as activation functions, normalization methods, and model architectures. Various topics, including manifold structures, curvature, wide neural networks, critical points, and adversarial robustness, are discussed, along with transfer learning and disentangled representations. The challenges and future research directions in feature space geometry are outlined, emphasizing the importance of understanding overparameterized models, unsupervised and semi-supervised learning, interpretable feature space geometry, topological analysis, and multimodal and multi-task learning.

MATHEMATICS (2023)

Review Biology

Deep Learning Techniques with Genomic Data in Cancer Prognosis: A Comprehensive Review of the 2021-2023 Literature

Minhyeok Lee

Summary: This paper provides a comprehensive review of the advancements in deep learning for cancer survival prediction from 2021 to 2023. It highlights essential developments and their implications in the field through a careful selection of research papers and thorough analysis of prevailing trends. The paper aims to enhance our understanding of deep learning's potential in cancer survival analysis and guide future research directions.

BIOLOGY-BASEL (2023)

Article Mathematics, Applied

Multi-Task Deep Learning Games: Investigating Nash Equilibria and Convergence Properties

Minhyeok Lee

Summary: This paper provides a rigorous game-theoretic analysis of multi-task deep learning, focusing on the dynamics and interactions of tasks within these models. It highlights the importance of understanding the strategic behavior and convergence characteristics of tasks in a multi-task deep learning system. The findings contribute to the theoretical understanding and future research directions in the field of multi-task deep learning.

AXIOMS (2023)

Article Computer Science, Artificial Intelligence

ICEGAN: inverse covariance estimating generative adversarial network

Insoo Kim, Minhyeok Lee, Junhee Seok

Summary: Due to the recent explosive expansion of deep learning, various challenging problems have been tackled by deep learning methods. However, deep learning-based network estimation is limited in terms of fixed variables and the inability to use convolutional layers. In this study, we propose an ICEGAN method that addresses these limitations by modifying concepts from cycle-consistent adversarial networks and using the Monte Carlo approach. ICEGAN demonstrated superior performances in network estimation compared to conventional models and ordinary GAN models, and showed promising results in gene network estimation of breast cancer using a gene expression dataset.

MACHINE LEARNING-SCIENCE AND TECHNOLOGY (2023)

Article Computer Science, Artificial Intelligence

Deep learning model with L1 penalty for predicting breast cancer metastasis using gene expression data

Jaeyoon Kim, Minhyeok Lee, Junhee Seok

Summary: This paper proposes a novel deep neural network architecture, DeepCME, for predicting breast cancer metastasis using gene expression data. The model addresses the problem of overfitting by implementing regularization methods. Experimental results show that DeepCME achieves the highest average AUC scores in most cross-validation cases and outperforms other baseline models. Additionally, the study identifies 30 significant genes related to breast cancer metastasis. Based on these findings, DeepCME is expected to be clinically utilized for predicting breast cancer metastasis and further applied to other types of cancer.

MACHINE LEARNING-SCIENCE AND TECHNOLOGY (2023)

Article Automation & Control Systems

Portfolio optimization using predictive auxiliary classifier generative adversarial networks

Jiwook Kim, Minhyeok Lee

Summary: This paper proposes a novel portfolio weighting strategy that incorporates both risk and return considerations within a deep learning framework. By introducing the Predictive Auxiliary Classifier Generative Adversarial Networks (PredACGAN), the model is able to measure prediction uncertainty and optimize portfolios considering both return and risk. Empirical results show that the PredACGAN portfolios achieve higher returns and Sharpe ratios, with lower maximum drawdowns, highlighting their effectiveness.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2023)

Review Biology

Transformer Architecture and Attention Mechanisms in Genome Data Analysis: A Comprehensive Review

Sanghyuk Roy Choi, Minhyeok Lee

Summary: The rapidly evolving field of deep learning, particularly transformer-based architectures and attention mechanisms, has extensive applications in bioinformatics and genome data analysis. This review critically evaluates the recent advancements and applications of these techniques in genome data analysis, discusses their advantages and limitations, and identifies potential future research areas.

BIOLOGY-BASEL (2023)

Article Environmental Sciences

Machine Learning Big Data Analysis of the Impact of Air Pollutants on Rhinitis-Related Hospital Visits

Soyeon Lee, Changwan Hyun, Minhyeok Lee

Summary: This study examines the complex relationship between various air pollutants and the incidence of rhinitis in Seoul, South Korea. Using a large dataset and machine learning techniques, the study found that CO and NO2 showed significant positive correlations with hospital visits for rhinitis, particularly with a 4-day lag. Interestingly, O3 had mixed results, while PM10 and PM2.5 were both significantly correlated with different types of hospital visits, highlighting their potential to worsen rhinitis symptoms.

TOXICS (2023)

Article Mathematics

Mathematical Analysis and Performance Evaluation of the GELU Activation Function in Deep Learning

Minhyeok Lee

Summary: Selecting the most suitable activation function is crucial for deep learning models, and the Gaussian error linear unit (GELU) has emerged as a dominant method, surpassing traditional functions like ReLU. This study rigorously investigates the mathematical properties of GELU, including differentiability, boundedness, stationarity, and smoothness. Experimental comparisons on CIFAR-10, CIFAR-100, and STL-10 datasets demonstrate the superior performance of GELU compared to other functions, making it a suitable choice for various deep learning applications. This comprehensive study contributes to a deeper understanding of GELU's mathematical properties and provides valuable insights for practitioners in selecting optimal activation functions for deep learning.

JOURNAL OF MATHEMATICS (2023)

Review Biotechnology & Applied Microbiology

Recent Advancements in Deep Learning Using Whole Slide Imaging for Cancer Prognosis

Minhyeok Lee

Summary: This review provides an in-depth analysis of recent advancements in deep learning techniques applied to whole slide images (WSIs) for cancer prognosis. The combination of deep learning and the availability of WSIs shows great potential in revolutionizing predictive modeling for cancer prognosis. It is crucial to systematically review contemporary methodologies and critically evaluate their impact due to the rapid evolution and complexity of the field. This review aims to present a comprehensive overview of the current landscape, catalog major developments, assess their strengths and weaknesses, and provide insights into future directions.

BIOENGINEERING-BASEL (2023)

Review Mathematics

Recent Advances in Generative Adversarial Networks for Gene Expression Data: A Comprehensive Review

Minhyeok Lee

Summary: The evolving field of generative artificial intelligence, particularly generative deep learning, is revolutionizing various scientific and technological sectors. Generative adversarial networks (GANs) have emerged as a pivotal innovation in this domain and have shown remarkable capabilities in crafting synthetic data that closely resemble real-world distributions. The application of GANs to gene expression data systems has become a rapidly growing focus area, offering a potential solution to limitations related to ethical and logistical issues. This review provides a thorough analysis of recent advancements in the intersection of GANs and gene expression data from 2019 to 2023, serving as a key resource for academics and professionals in guiding subsequent research efforts and catalyzing growth in the discipline.

MATHEMATICS (2023)

Proceedings Paper Computer Science, Artificial Intelligence

FIDGAN: A Generative Adversarial Network with An Inception Distance

Jina Lee, Minhyeok Lee

Summary: Two evaluation metrics, Inception score (IS) and Fréchet Inception distance (FID), have been proposed for GAN models. We introduce a novel GAN model that utilizes the backpropagation of the FID score to guide efficient learning of the real image distribution and generation of high-quality images. With training on the CIFAR-10 dataset, FIDGAN achieved an FID of 11.78, representing a 20.0% reduction compared to the existing model BigGAN.

2023 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION, ICAIIC (2023)

Article Computer Science, Artificial Intelligence

Reduced-complexity Convolutional Neural Network in the compressed domain

Hamdan Abdellatef, Lina J. Karam

Summary: This paper proposes performing the learning and inference processes in the compressed domain to reduce computational complexity and improve the speed of neural networks. Experimental results show that a modified ResNet-50 in the compressed domain is 70% faster than the traditional spatial-based ResNet-50 while maintaining similar accuracy. Additionally, a preprocessing step with partial encoding is suggested to improve resilience to distortions caused by low-quality encoded images. Training a network with highly compressed data can achieve good classification accuracy with significantly reduced storage requirements.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Theoretical limits on the speed of learning inverse models explain the rate of adaptation in arm reaching tasks

Victor R. Barradas, Yasuharu Koike, Nicolas Schweighofer

Summary: Inverse models are essential for human motor learning as they map desired actions to motor commands. The shape of the error surface and the distribution of targets in a task play a crucial role in determining the speed of learning.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Learning a robust foundation model against clean-label data poisoning attacks at downstream tasks

Ting Zhou, Hanshu Yan, Jingfeng Zhang, Lei Liu, Bo Han

Summary: We propose a defense strategy that reduces the success rate of data poisoning attacks in downstream tasks by pre-training a robust foundation model.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

AdaSAM: Boosting sharpness-aware minimization with adaptive learning rate and momentum for neural networks

Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao

Summary: In this paper, the convergence rate of AdaSAM in the stochastic non-convex setting is analyzed. Theoretical proof shows that AdaSAM has a linear speedup property and decouples the stochastic gradient steps with the adaptive learning rate and perturbed gradient. Experimental results demonstrate that AdaSAM outperforms other optimizers in terms of performance.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Grasping detection of dual manipulators based on Markov decision process with neural network

Juntong Yun, Du Jiang, Li Huang, Bo Tao, Shangchun Liao, Ying Liu, Xin Liu, Gongfa Li, Disi Chen, Baojia Chen

Summary: In this study, a dual-manipulator grasping detection model based on the Markov decision process is proposed. By parameterizing the grasping detection model of dual manipulators using a cross-entropy convolutional neural network and a fully convolutional neural network, stable grasping of complex multiple objects is achieved. Robot grasping experiments were conducted to verify the feasibility and superiority of this method.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Asymmetric double networks mutual teaching for unsupervised person Re-identification

Miaohui Zhang, Kaifang Li, Jianxin Ma, Xile Wang

Summary: This paper proposes an unsupervised person re-identification (Re-ID) method that uses two asymmetric networks to generate pseudo-labels for each other by clustering and updates and optimizes the pseudo-labels through alternate training. It also designs similarity compensation and similarity suppression based on the camera ID of pedestrian images to optimize the similarity measure. Extensive experiments show that the proposed method achieves superior performance compared to state-of-the-art unsupervised person re-identification methods.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Low-variance Forward Gradients using Direct Feedback Alignment and momentum

Florian Bacho, Dominique Chu

Summary: This paper proposes a new approach called the Forward Direct Feedback Alignment algorithm for supervised learning in deep neural networks. By combining activity-perturbed forward gradients, direct feedback alignment, and momentum, this method achieves better performance and convergence speed compared to other local alternatives to backpropagation.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Maximum margin and global criterion based-recursive feature selection

Xiaojian Ding, Yi Li, Shilin Chen

Summary: This research paper addresses the limitations of recursive feature elimination (RFE) and its variants in high-dimensional feature selection tasks. The proposed algorithms, which introduce a novel feature ranking criterion and an optimal feature subset evaluation algorithm, outperform current state-of-the-art methods.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Mental image reconstruction from human brain activity: Neural decoding of mental imagery via deep neural network-based Bayesian estimation

Naoko Koide-Majima, Shinji Nishimoto, Kei Majima

Summary: Visual images observed by humans can be reconstructed from brain activity, and the visualization of arbitrary natural images from mental imagery has been achieved through an improved method. This study provides a unique tool for directly investigating the subjective contents of the brain.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Hierarchical attention network with progressive feature fusion for facial expression recognition

Huanjie Tao, Qianyue Duan

Summary: In this paper, a hierarchical attention network with progressive feature fusion is proposed for facial expression recognition (FER), addressing the challenges posed by pose variation, occlusions, and illumination variation. The model achieves enhanced performance by aggregating diverse features and progressively enhancing discriminative features.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

SLAPP: Subgraph-level attention-based performance prediction for deep learning models

Zhenyi Wang, Pengfei Yang, Linwei Hu, Bowen Zhang, Chengmin Lin, Wenkai Lv, Quan Wang

Summary: In the face of the complex landscape of deep learning, we propose a novel subgraph-level performance prediction method called SLAPP, which combines graph and operator features through an innovative graph neural network called EAGAT, providing accurate performance predictions. In addition, we introduce a mixed loss design with dynamic weight adjustment to improve predictive accuracy.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

LDCNet: Lightweight dynamic convolution network for laparoscopic procedures image segmentation

Yiyang Yin, Shuangling Luo, Jun Zhou, Liang Kang, Calvin Yu-Chian Chen

Summary: Medical image segmentation is crucial for modern healthcare systems, especially in reducing surgical risks and planning treatments. Transanal total mesorectal excision (TaTME) has become an important method for treating colon and rectum cancers. Real-time instance segmentation during TaTME surgeries can assist surgeons in minimizing risks. However, the dynamic variations in TaTME images pose challenges for accurate instance segmentation.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Start-stop points CenterNet for wideband signals detection and time-frequency localization in spectrum sensing

Teng Cheng, Lei Sun, Junning Zhang, Jinling Wang, Zhanyang Wei

Summary: This study proposes a scheme that combines the start-stop point signal features for wideband multi-signal detection, called Fast Spectrum-Size Self-Training network (FSSNet). By utilizing start-stop points to build the signal model, this method successfully solves the difficulty of existing deep learning methods in detecting discontinuous signals and achieves satisfactory detection speed.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Learning deep representation and discriminative features for clustering of multi-layer networks

Wenming Wu, Xiaoke Ma, Quan Wang, Maoguo Gong, Quanxue Gao

Summary: The layer-specific modules in multi-layer networks are critical for understanding the structure and function of the system. However, existing methods fail to accurately characterize and balance the connectivity and specificity of these modules. To address this issue, a joint learning graph clustering algorithm (DRDF) is proposed, which learns the deep representation and discriminative features of the multi-layer network, and balances the connectivity and specificity of the layer-specific modules through joint learning.

NEURAL NETWORKS (2024)

Article Computer Science, Artificial Intelligence

Boundary uncertainty aware network for automated polyp segmentation

Guanghui Yue, Guibin Zhuo, Weiqing Yan, Tianwei Zhou, Chang Tang, Peng Yang, Tianfu Wang

Summary: This paper proposes a novel boundary uncertainty aware network (BUNet) for precise and robust colorectal polyp segmentation. BUNet utilizes a pyramid vision transformer encoder to learn multi-scale features and incorporates a boundary exploration module (BEM) and a boundary uncertainty aware module (BUM) to handle boundary areas. Experimental results demonstrate that BUNet outperforms other methods in terms of performance and generalization ability.

NEURAL NETWORKS (2024)