Article
Computer Science, Information Systems
Song Sun, Bo Zhao, Muhammad Mateen, Xin Chen, Junhao Wen
Summary: Recent studies have proposed an end-to-end learning framework for generating diverse, realistic, and controllable face images guided by face masks. By using a style encoder, a generator, and a discriminator, the proposed model can generate face images with different styles based on the input face mask and finely control the generated face image by manipulating the face mask.
FRONTIERS OF COMPUTER SCIENCE
(2022)
Article
Computer Science, Information Systems
Muhammad Umair Hassan, Saleh Alaliyat, Ibrahim A. Hameed
Summary: An image is a powerful way to convey the meaning and essence of complex topics, ideas, and concepts. Teaching computers how to recognize and generate images is crucial in computer vision. Generating controlled and complex images based on scene graphs and layouts is a challenging task, but has been made easier with advances in generative modeling. The performance of scene graph and scene layout-based image generation models can be evaluated using a standard methodology, and experimental results show that scene layout-based image generation outperforms graph-based generation in most evaluations.
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
(2023)
Article
Computer Science, Software Engineering
Luying Li, Junshu Tang, Zhiwen Shao, Xin Tan, Lizhuang Ma
Summary: This work introduces a two-stage sketch-to-photo generative adversarial network for face generation, utilizing semantic loss, color refinement loss, multi-scale discriminator, and other techniques to enhance the details and quality of synthesized images.
Article
Computer Science, Artificial Intelligence
Qi Mao, Hung-Yu Tseng, Hsin-Ying Lee, Jia-Bin Huang, Siwei Ma, Ming-Hsuan Yang
Summary: The paper presents an effective method for continuous translation between different domains using signed attribute vectors. It generates a smooth sequence of intermediate results and utilizes adversarial training to enhance the visual quality of translation. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods in image translation tasks.
INTERNATIONAL JOURNAL OF COMPUTER VISION
(2022)
Article
Computer Science, Theory & Methods
Long Zhang, Lin Zhao
Summary: This study proposes a generative adversarial network (GAN) based on the particle swarm optimization (PSO) algorithm to improve training stability and address training difficulties in GANs. It improves the inertia weight of the particle swarm and judges the aggregation degree of the particles to ensure both optimization ability and diversity.
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
(2021)
Article
Computer Science, Information Systems
Ziqiang Zheng, Yi Bin, Xiaoou Lv, Yang Wu, Yang Yang, Heng Tao Shen
Summary: The study proposes an asynchronous generative adversarial network called Async-GAN, which addresses the problem of asymmetric unpaired image-to-image translation. It iteratively builds gradually improving intermediate domains to generate pseudo paired training samples, providing stronger full supervision to assist in the translation from the information-poor domain to the information-rich domain.
IEEE TRANSACTIONS ON MULTIMEDIA
(2023)
Article
Environmental Sciences
Biao Wang, Lingxuan Zhu, Xing Guo, Xiaobing Wang, Jiaji Wu
Summary: This paper proposes an approach based on the shared latent domain hypothesis and a generative adversarial network for generating spectral remote sensing images of the Earth's background. By mining the correlation between spectra, this method improves the accuracy and geographic precision of image generation.
Article
Computer Science, Information Systems
Junyu Lin, Xuemeng Song, Tian Gan, Yiyang Yao, Weifeng Liu, Liqiang Nie
Summary: The study focused on generating clothing items from online fashion blogs, introducing the PaintNet framework and conducting experiments on the Lookbook dataset to verify its effectiveness. Results showed that PaintNet performed well in clothing generation and cross-domain clothing retrieval tasks.
MULTIMEDIA TOOLS AND APPLICATIONS
(2021)
Article
Computer Science, Artificial Intelligence
Che-Tsung Lin, Jie-Long Kew, Chee Seng Chan, Shang-Hong Lai, Christopher Zach
Summary: Recent advances in GANs have shown promising results in domain adaptation for object detectors through data augmentation. However, existing methods that preserve objects well in image-to-image translation often require pixel-level annotations or object detectors at test time. This work proposes AugGAN-Det, which utilizes Cycle-object Consistency (CoCo) loss to generate instance-aware translated images across complex domains. The model outperforms previous models in terms of object preservation, instance-level translation, detection accuracy, and visual perceptual quality, without the need for explicit feature alignment or a detector at test time.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Theory & Methods
Zhizhong Huang, Shouzhen Chen, Junping Zhang, Hongming Shan
Summary: The paper introduces a novel progressive face aging framework based on generative adversarial network (PFA-GAN) to address the issues of existing methods. Unlike traditional single-network approaches, the proposed framework contains multiple sub-networks to mimic the gradual aging process of faces and achieve better aging accuracy and image quality. The framework can be trained end-to-end to eliminate artifacts and blur, and introduces an age estimation loss for improved accuracy, with the Pearson correlation coefficient used as an evaluation metric for aging smoothness.
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY
(2021)
Article
Computer Science, Information Systems
Koya Tango, Marie Katsurai, Hayato Maki, Ryosuke Goto
Summary: This paper presents a method for automatic generation of cosplay images based on image-to-image translation. By reinterpreting animated images as real garments, this method is able to produce diverse and realistic cosplay images. Experiments show that the proposed method performs better in generating cosplay images.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Computer Science, Interdisciplinary Applications
Aldo Marzullo, Sara Moccia, Michele Catellani, Francesco Calimeri, Elena De Momi
Summary: Deep Learning has made significant advancements in medical imaging, but faces challenges in the surgery field due to a lack of large-scale data. This study presents a method for MIS image synthesis using generative adversarial networks to generate realistic surgical images.
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE
(2021)
Article
Computer Science, Information Systems
Suli Li, Hyo Jong Lee
Summary: A gender-preserving face aging model (GFAM) is proposed to address the issue of personalized patterns being overlooked in existing methods. GFAM utilizes a generative adversarial network and includes subnetworks to simulate aging effects. The model introduces a gender classifier, a gender loss function, and an identity-preserving module to maintain gender attributes and identity information of synthetic faces.
Article
Computer Science, Artificial Intelligence
Yuxin Ding, Longfei Wang
Summary: Image-to-image translation involves translating images from one domain to another, with CycleGAN being a model that can do this without paired training data. To reduce complexity, a shared hidden space storing common features can be used, along with a common encoder to learn these features.
NEURAL COMPUTING & APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Deyin Liu, Lin Wu, Feng Zheng, Lingqiao Liu, Meng Wang
Summary: Person image generation conditioned on natural language allows us to personalize image editing in a user-friendly manner. We propose a novel pose-guided multi-granularity attention architecture to synthesize person images. By incorporating sentence-level description and pose feature maps, we generate a coarse person image and further enhance it by drawing human body parts with highly correlated textual nouns and determining the spatial positions with respect to target pose points.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Review
Biochemistry & Molecular Biology
Yoojoong Kim, Minhyeok Lee
Summary: This review paper analyzes the convergence of deep learning and long non-coding RNAs (lncRNAs), a rapidly evolving field. It aims to provide a comprehensive examination of these intertwined research areas, given the recent advancements in deep learning and the increasing recognition of lncRNAs' importance. By scrutinizing research published from 2021 to 2023, this paper offers insights into how deep learning techniques are employed to investigate lncRNAs.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2023)
Article
Mathematics
Minhyeok Lee
Summary: In this paper, a comprehensive mathematical analysis of the hallucination phenomenon in GPT models is presented. Hallucination and creativity are rigorously defined and measured using concepts from probability theory and information theory. The trade-off between hallucination and creativity is characterized by introducing a parametric family of GPT models, and an optimal balance that maximizes model performance across various tasks is identified. This work offers a novel mathematical framework for understanding the origins and implications of hallucination in GPT models and facilitates future research and development in the field of large language models (LLMs).
Review
Mathematics
Minhyeok Lee
Summary: This review provides a comprehensive outlook on the geometry of feature spaces in deep learning models, exploring the interconnections between feature spaces and influential factors such as activation functions, normalization methods, and model architectures. Various topics, including manifold structures, curvature, wide neural networks, critical points, and adversarial robustness, are discussed, along with transfer learning and disentangled representations. The challenges and future research directions in feature space geometry are outlined, emphasizing the importance of understanding overparameterized models, unsupervised and semi-supervised learning, interpretable feature space geometry, topological analysis, and multimodal and multi-task learning.
Review
Biology
Minhyeok Lee
Summary: This paper provides a comprehensive review of the advancements in deep learning for cancer survival prediction from 2021 to 2023. It highlights essential developments and their implications in the field through a careful selection of research papers and thorough analysis of prevailing trends. The paper aims to enhance our understanding of deep learning's potential in cancer survival analysis and guide future research directions.
Article
Mathematics, Applied
Minhyeok Lee
Summary: This paper provides a rigorous game-theoretic analysis of multi-task deep learning, focusing on the dynamics and interactions of tasks within these models. It highlights the importance of understanding the strategic behavior and convergence characteristics of tasks in a multi-task deep learning system. The findings contribute to the theoretical understanding and future research directions in the field of multi-task deep learning.
Article
Computer Science, Artificial Intelligence
Insoo Kim, Minhyeok Lee, Junhee Seok
Summary: Due to the recent explosive expansion of deep learning, various challenging problems have been tackled by deep learning methods. However, deep learning-based network estimation is limited by its reliance on fixed variables and its inability to use convolutional layers. In this study, we propose ICEGAN, a method that addresses these limitations by modifying concepts from cycle-consistent adversarial networks and using the Monte Carlo approach. ICEGAN demonstrated superior performance in network estimation compared to conventional models and ordinary GAN models, and showed promising results in gene network estimation of breast cancer using a gene expression dataset.
MACHINE LEARNING-SCIENCE AND TECHNOLOGY
(2023)
Article
Computer Science, Artificial Intelligence
Jaeyoon Kim, Minhyeok Lee, Junhee Seok
Summary: This paper proposes a novel deep neural network architecture, DeepCME, for predicting breast cancer metastasis using gene expression data. The model addresses the problem of overfitting by implementing regularization methods. Experimental results show that DeepCME achieves the highest average AUC scores in most cross-validation cases and outperforms other baseline models. Additionally, the study identifies 30 significant genes related to breast cancer metastasis. Based on these findings, DeepCME is expected to be clinically utilized for predicting breast cancer metastasis and further applied to other types of cancer.
MACHINE LEARNING-SCIENCE AND TECHNOLOGY
(2023)
Article
Automation & Control Systems
Jiwook Kim, Minhyeok Lee
Summary: This paper proposes a novel portfolio weighting strategy that incorporates both risk and return considerations within a deep learning framework. By introducing the Predictive Auxiliary Classifier Generative Adversarial Networks (PredACGAN), the model is able to measure prediction uncertainty and optimize portfolios considering both return and risk. Empirical results show that the PredACGAN portfolios achieve higher returns and Sharpe ratios, with lower maximum drawdowns, highlighting their effectiveness.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Review
Biology
Sanghyuk Roy Choi, Minhyeok Lee
Summary: The rapidly evolving field of deep learning, particularly transformer-based architectures and attention mechanisms, has extensive applications in bioinformatics and genome data analysis. This review critically evaluates the recent advancements and applications of these techniques in genome data analysis, discusses their advantages and limitations, and identifies potential future research areas.
Article
Environmental Sciences
Soyeon Lee, Changwan Hyun, Minhyeok Lee
Summary: This study examines the complex relationship between various air pollutants and the incidence of rhinitis in Seoul, South Korea. Using a large dataset and machine learning techniques, the study found that CO and NO2 showed significant positive correlations with hospital visits for rhinitis, particularly with a 4-day lag. Interestingly, O3 had mixed results, while PM10 and PM2.5 were both significantly correlated with different types of hospital visits, highlighting their potential to worsen rhinitis symptoms.
Article
Mathematics
Minhyeok Lee
Summary: Selecting the most suitable activation function is crucial for deep learning models, and the Gaussian error linear unit (GELU) has emerged as a dominant method, surpassing traditional functions like ReLU. This study rigorously investigates the mathematical properties of GELU, including differentiability, boundedness, stationarity, and smoothness. Experimental comparisons on CIFAR-10, CIFAR-100, and STL-10 datasets demonstrate the superior performance of GELU compared to other functions, making it a suitable choice for various deep learning applications. This comprehensive study contributes to a deeper understanding of GELU's mathematical properties and provides valuable insights for practitioners in selecting optimal activation functions for deep learning.
JOURNAL OF MATHEMATICS
(2023)
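For readers unfamiliar with the activation function discussed in the preceding summary, the following is a minimal sketch of the standard GELU definition, GELU(x) = x * Phi(x), where Phi is the standard normal CDF. The function names `gelu`, `gelu_grad`, and `relu` are illustrative choices made here and are not taken from the paper; the sketch only shows the textbook formula and its derivative, not the paper's experiments.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_grad(x: float) -> float:
    """Derivative of GELU: Phi(x) + x * phi(x), with phi the standard normal PDF."""
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return cdf + x * pdf


def relu(x: float) -> float:
    """ReLU, shown only for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    # Print both activations on a few sample inputs.
    for value in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(value, gelu(value), relu(value), gelu_grad(value))
```

Unlike ReLU, GELU is smooth everywhere and takes small negative values for negative inputs, which is consistent with the differentiability and smoothness properties the summary mentions.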
Review
Biotechnology & Applied Microbiology
Minhyeok Lee
Summary: This review provides an in-depth analysis of recent advancements in deep learning techniques applied to whole slide images (WSIs) for cancer prognosis. The combination of deep learning and the availability of WSIs shows great potential in revolutionizing predictive modeling for cancer prognosis. It is crucial to systematically review contemporary methodologies and critically evaluate their impact due to the rapid evolution and complexity of the field. This review aims to present a comprehensive overview of the current landscape, catalog major developments, assess their strengths and weaknesses, and provide insights into future directions.
BIOENGINEERING-BASEL
(2023)
Review
Mathematics
Minhyeok Lee
Summary: The evolving field of generative artificial intelligence, particularly generative deep learning, is revolutionizing various scientific and technological sectors. Generative adversarial networks (GANs) have emerged as a pivotal innovation in this domain and have shown remarkable capabilities in crafting synthetic data that closely resemble real-world distributions. The application of GANs to gene expression data systems has become a rapidly growing focus area, offering a potential solution to limitations related to ethical and logistical issues. This review provides a thorough analysis of recent advancements in the intersection of GANs and gene expression data from 2019 to 2023, serving as a key resource for academics and professionals in guiding subsequent research efforts and catalyzing growth in the discipline.
Proceedings Paper
Computer Science, Artificial Intelligence
Jina Lee, Minhyeok Lee
Summary: Two evaluation metrics, Inception score (IS) and Fréchet Inception distance (FID), have been proposed for GAN models. We introduce a novel GAN model, FIDGAN, that backpropagates the FID score to guide efficient learning of the real image distribution and the generation of high-quality images. Trained on the CIFAR-10 dataset, FIDGAN achieved an FID of 11.78, a 20.0% reduction compared to the existing model BigGAN.
2023 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION, ICAIIC
(2023)
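As a rough illustration of the metric named in the preceding summary, the sketch below computes a Fréchet distance between two sets of feature vectors under a simplifying assumption made here: each set is modeled as a Gaussian with a diagonal covariance, so the distance reduces to a sum of squared differences of per-dimension means and standard deviations. The actual FID uses full covariance matrices of Inception network features; the function names and toy inputs are hypothetical. The helper `implied_baseline` only rearranges the two numbers stated in the summary (a score of 11.78 and a 20.0% relative reduction) and does not report any additional result.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diag(feats_a, feats_b):
    """Squared Fréchet distance between two Gaussians with diagonal covariance.

    Each argument is a list of equal-length feature vectors. Per dimension,
    the contribution is (mu_a - mu_b)^2 + (sd_a - sd_b)^2.
    """
    dims = len(feats_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [float(f[d]) for f in feats_a]
        col_b = [float(f[d]) for f in feats_b]
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        sd_a = math.sqrt(pvariance(col_a))
        sd_b = math.sqrt(pvariance(col_b))
        total += (mu_a - mu_b) ** 2 + (sd_a - sd_b) ** 2
    return total


def implied_baseline(score, relative_reduction):
    """Baseline score implied by a score and its stated relative reduction."""
    return score / (1.0 - relative_reduction)


if __name__ == "__main__":
    real = [[0.0, 0.0], [2.0, 2.0]]   # toy "real" features (hypothetical)
    fake = [[1.0, 1.0], [3.0, 3.0]]   # toy "generated" features (hypothetical)
    print(frechet_distance_diag(real, fake))
    print(implied_baseline(11.78, 0.20))
```

A lower distance means the generated feature statistics are closer to the real ones, which is why a reduction in FID is reported as an improvement.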
Article
Computer Science, Artificial Intelligence
Hamdan Abdellatef, Lina J. Karam
Summary: This paper proposes performing the learning and inference processes in the compressed domain to reduce computational complexity and improve the speed of neural networks. Experimental results show that a modified ResNet-50 in the compressed domain is 70% faster than the traditional spatial-based ResNet-50 while maintaining similar accuracy. Additionally, a preprocessing step with partial encoding is suggested to improve resilience to distortions caused by low-quality encoded images. Training a network with highly compressed data can achieve good classification accuracy with significantly reduced storage requirements.
Article
Computer Science, Artificial Intelligence
Victor R. Barradas, Yasuharu Koike, Nicolas Schweighofer
Summary: Inverse models are essential for human motor learning as they map desired actions to motor commands. The shape of the error surface and the distribution of targets in a task play a crucial role in determining the speed of learning.
Article
Computer Science, Artificial Intelligence
Ting Zhou, Hanshu Yan, Jingfeng Zhang, Lei Liu, Bo Han
Summary: We propose a defense strategy that reduces the success rate of data poisoning attacks in downstream tasks by pre-training a robust foundation model.
Article
Computer Science, Artificial Intelligence
Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao
Summary: In this paper, the convergence rate of AdaSAM in the stochastic non-convex setting is analyzed. The theoretical proof shows that AdaSAM has a linear speedup property and decouples the stochastic gradient steps from the adaptive learning rate and perturbed gradient. Experimental results demonstrate that AdaSAM outperforms other optimizers.
Article
Computer Science, Artificial Intelligence
Juntong Yun, Du Jiang, Li Huang, Bo Tao, Shangchun Liao, Ying Liu, Xin Liu, Gongfa Li, Disi Chen, Baojia Chen
Summary: In this study, a dual manipulator grasping detection model based on the Markov decision process is proposed. By parameterizing the grasping detection model of dual manipulators using a cross entropy convolutional neural network and a fully convolutional neural network, stable grasping of multiple complex objects is achieved. Robot grasping experiments were conducted to verify the feasibility and superiority of this method.
Article
Computer Science, Artificial Intelligence
Miaohui Zhang, Kaifang Li, Jianxin Ma, Xile Wang
Summary: This paper proposes an unsupervised person re-identification (Re-ID) method that uses two asymmetric networks to generate pseudo-labels for each other by clustering and updates and optimizes the pseudo-labels through alternate training. It also designs similarity compensation and similarity suppression based on the camera ID of pedestrian images to optimize the similarity measure. Extensive experiments show that the proposed method achieves superior performance compared to state-of-the-art unsupervised person re-identification methods.
Article
Computer Science, Artificial Intelligence
Florian Bacho, Dominique Chu
Summary: This paper proposes a new approach called the Forward Direct Feedback Alignment algorithm for supervised learning in deep neural networks. By combining activity-perturbed forward gradients, direct feedback alignment, and momentum, this method achieves better performance and convergence speed compared to other local alternatives to backpropagation.
Article
Computer Science, Artificial Intelligence
Xiaojian Ding, Yi Li, Shilin Chen
Summary: This research paper addresses the limitations of recursive feature elimination (RFE) and its variants in high-dimensional feature selection tasks. The proposed algorithms, which introduce a novel feature ranking criterion and an optimal feature subset evaluation algorithm, outperform current state-of-the-art methods.
Article
Computer Science, Artificial Intelligence
Naoko Koide-Majima, Shinji Nishimoto, Kei Majima
Summary: Visual images observed by humans can be reconstructed from brain activity, and the visualization of arbitrary natural images from mental imagery has been achieved through an improved method. This study provides a unique tool for directly investigating the subjective contents of the brain.
Article
Computer Science, Artificial Intelligence
Huanjie Tao, Qianyue Duan
Summary: In this paper, a hierarchical attention network with progressive feature fusion is proposed for facial expression recognition (FER), addressing the challenges posed by pose variation, occlusions, and illumination variation. The model achieves enhanced performance by aggregating diverse features and progressively enhancing discriminative features.
Article
Computer Science, Artificial Intelligence
Zhenyi Wang, Pengfei Yang, Linwei Hu, Bowen Zhang, Chengmin Lin, Wenkai Lv, Quan Wang
Summary: In the face of the complex landscape of deep learning, we propose a novel subgraph-level performance prediction method called SLAPP, which combines graph and operator features through an innovative graph neural network called EAGAT, providing accurate performance predictions. In addition, we introduce a mixed loss design with dynamic weight adjustment to improve predictive accuracy.
Article
Computer Science, Artificial Intelligence
Yiyang Yin, Shuangling Luo, Jun Zhou, Liang Kang, Calvin Yu-Chian Chen
Summary: Medical image segmentation is crucial for modern healthcare systems, especially in reducing surgical risks and planning treatments. Transanal total mesorectal excision (TaTME) has become an important method for treating colon and rectum cancers. Real-time instance segmentation during TaTME surgeries can assist surgeons in minimizing risks. However, the dynamic variations in TaTME images pose challenges for accurate instance segmentation.
Article
Computer Science, Artificial Intelligence
Teng Cheng, Lei Sun, Junning Zhang, Jinling Wang, Zhanyang Wei
Summary: This study proposes a scheme that combines the start-stop point signal features for wideband multi-signal detection, called Fast Spectrum-Size Self-Training network (FSSNet). By utilizing start-stop points to build the signal model, this method successfully solves the difficulty of existing deep learning methods in detecting discontinuous signals and achieves satisfactory detection speed.
Article
Computer Science, Artificial Intelligence
Wenming Wu, Xiaoke Ma, Quan Wang, Maoguo Gong, Quanxue Gao
Summary: The layer-specific modules in multi-layer networks are critical for understanding the structure and function of the system. However, existing methods fail to accurately characterize and balance the connectivity and specificity of these modules. To address this issue, a joint learning graph clustering algorithm (DRDF) is proposed, which learns the deep representation and discriminative features of the multi-layer network, and balances the connectivity and specificity of the layer-specific modules through joint learning.
Article
Computer Science, Artificial Intelligence
Guanghui Yue, Guibin Zhuo, Weiqing Yan, Tianwei Zhou, Chang Tang, Peng Yang, Tianfu Wang
Summary: This paper proposes a novel boundary uncertainty aware network (BUNet) for precise and robust colorectal polyp segmentation. BUNet utilizes a pyramid vision transformer encoder to learn multi-scale features and incorporates a boundary exploration module (BEM) and a boundary uncertainty aware module (BUM) to handle boundary areas. Experimental results demonstrate that BUNet outperforms other methods in terms of performance and generalization ability.