Article
Computer Science, Software Engineering
Monica Villanueva Aylagas, Hector Anadon Leon, Mattias Teye, Konrad Tollmar
Summary: Voice2Face is a deep learning model that generates face and tongue animations directly from recorded speech, offering advantages over previous work. User studies and quantitative evaluations demonstrate its superiority in animation quality and accurate lip closure, as well as its robust performance across varying data quality.
COMPUTER GRAPHICS FORUM
(2022)
Article
Computer Science, Software Engineering
Pengfei Liu, Qianwen Chao, Henwei Huang, Qiongyan Wang, Zhongyuan Zhao, Qi Peng, Milo K. Yip, Elvis S. Liu, Xiaogang Jin
Summary: The study introduces a novel velocity-based framework for dynamic crowd simulation that offers interactive control over crowd movements, simulates thousands of agents at interactive rates, and is general and scalable enough for various robot navigation tests. The framework is validated through simulation experiments and comparisons with real-world data and existing crowd simulation methods.
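The summary above does not reproduce the paper's formulation; as a purely illustrative sketch of what a velocity-based crowd update generally looks like (not the authors' framework, and every function name and constant here is invented), agents can steer toward their goals while being repelled by close neighbors:

```python
import numpy as np

def step_agents(pos, goal, dt=0.1, v_max=1.5, avoid_radius=1.0, avoid_gain=0.8):
    """One velocity-based update: steer toward goals, repel from close neighbors."""
    to_goal = goal - pos
    dist = np.linalg.norm(to_goal, axis=1, keepdims=True)
    # Desired velocity points at the goal at maximum speed (zero when already there).
    desired = np.where(dist > 1e-6, to_goal / np.maximum(dist, 1e-6) * v_max, 0.0)

    # Pairwise repulsion from neighbors within avoid_radius.
    diff = pos[:, None, :] - pos[None, :, :]          # (n, n, 2) offsets
    d = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(d, np.inf)                       # ignore self
    close = d < avoid_radius
    push = np.where(close[:, :, None], diff / np.maximum(d, 1e-6)[:, :, None], 0.0)
    desired += avoid_gain * push.sum(axis=1)

    # Clamp speed and integrate positions.
    speed = np.linalg.norm(desired, axis=1, keepdims=True)
    vel = desired * np.minimum(1.0, v_max / np.maximum(speed, 1e-6))
    return pos + vel * dt
```

A real velocity-based simulator would add collision prediction and group-level controls on top of this kind of per-step velocity adjustment.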
Article
Computer Science, Software Engineering
Lucio Moser, Chinyu Chien, Mark Williams, Jose Serra, Darren Hendler, Doug Roble
Summary: The algorithm proposed in this study automatically transfers facial expressions between videos and 3D characters, as well as between different 3D characters. By learning common latent representations and mappings between images, expressions can be retargeted across characters regardless of physiological differences. The technique can be applied to markerless motion capture and automatic facial animation transfer.
ACM TRANSACTIONS ON GRAPHICS
(2021)
Article
Computer Science, Software Engineering
He Chen, Hyojoon Park, Kutay Macit, Ladislav Kavan
Summary: The new method captures detailed human motion, outputs precise point coordinates with unique labels, and relies on 2D images only. It utilizes a special motion capture suit and neural networks to process images, making it easy to replicate and deploy. The method can accurately capture various human poses, including challenging motions like yoga and gymnastics.
ACM TRANSACTIONS ON GRAPHICS
(2021)
Article
Computer Science, Software Engineering
Jiali Chen, Changjie Fan, Zhimeng Zhang, Gongzheng Li, Zeng Zhao, Zhigang Deng, Yu Ding
Summary: This article proposes a fully automatic, deep-learning-based framework that synthesizes realistic upper-body animations from guzheng music input. The approach uses a generative adversarial network (GAN) to capture the temporal relationship between the music and the human motion data. Extensive experiments show that the method generates visually plausible guzheng-playing animations that are well synchronized with the input music, outperforming state-of-the-art methods. An ablation study validates the contributions of the carefully designed modules in the framework.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2023)
Article
Computer Science, Artificial Intelligence
Rizwan Sadiq, Engin Erzin
Summary: This article improves affective facial animations through domain adaptation and data augmentation. The proposed models show significant MSE loss improvements in experiments, and the resulting facial animations are preferred by subjects in subjective evaluations.
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING
(2022)
Article
Computer Science, Software Engineering
Yilong Liu, Chengwei Zheng, Feng Xu, Xin Tong, Baining Guo
Summary: This article introduces a data-driven approach for modeling and animating 3D necks, which decomposes neck animation into local and global deformations, and utilizes a dataset to learn a neck model and regressor for driving the animation.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2021)
Article
Computer Science, Software Engineering
Jingying Liu, Binyuan Hui, Kun Li, Yunke Liu, Yu-Kun Lai, Yuxiang Zhang, Yebin Liu, Jingyu Yang
Summary: This paper proposes a Geometry-guided Dense Perspective Network (GDPnet) for speaker-independent, realistic 3D facial animation. GDPnet uses a densely connected encoder to strengthen feature propagation and audio feature reuse, and integrates an attention mechanism in the decoder for adaptive feature recalibration. A non-linear face reconstruction representation improves the accuracy of deformations and mitigates geometry-related issues.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2022)
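The summary does not specify GDPnet's attention mechanism; a common generic form of "adaptive feature recalibration" is a squeeze-and-excitation-style channel gate, sketched below under that assumption (all names and shapes here are invented, not taken from the paper):

```python
import numpy as np

def recalibrate(features, w1, w2):
    """Squeeze-and-excitation-style channel recalibration (a generic stand-in).

    features: (T, C) activations. Channel descriptors are pooled over time,
    passed through a small bottleneck, and used to rescale each channel.
    """
    s = features.mean(axis=0)                  # squeeze: global pooling per channel
    h = np.maximum(s @ w1, 0.0)                # excitation: bottleneck + ReLU
    gate = 1.0 / (1.0 + np.exp(-(h @ w2)))     # sigmoid gate in (0, 1) per channel
    return features * gate                     # rescale channels adaptively
```

Because the gate lies in (0, 1), each channel is attenuated in proportion to its learned importance rather than hard-selected.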
Article
Computer Science, Software Engineering
Longwen Zhang, Chuxiao Zeng, Qixuan Zhang, Hongyang Lin, Ruixiang Cao, Wei Yang, Lan Xu, Jingyi Yu
Summary: This paper presents a new learning-based, video-driven approach for generating dynamic facial geometry with high-quality, physically based assets. Facial expressions, geometry, and physically based textures are modeled with separate VAEs to preserve the characteristics of each attribute. Comprehensive experiments show that the technique achieves higher accuracy and visual fidelity in facial reconstruction and animation.
ACM TRANSACTIONS ON GRAPHICS
(2022)
Article
Computer Science, Software Engineering
Saeed Ghorbani, Ylva Ferstl, Daniel Holden, Nikolaus F. Troje, Marc-Andre Carbonneau
Summary: We introduce ZeroEGGS, a neural network framework for generating speech-driven gestures with zero-shot style control from example clips. Our model uses a variational framework to learn style embeddings, enabling easy style modification. Through a series of experiments, we demonstrate the flexibility and generalizability of our model to new speakers and styles, and show its superiority in naturalness of motion, appropriateness for speech, and style portrayal compared to previous techniques. We also release a high-quality dataset for further research.
COMPUTER GRAPHICS FORUM
(2023)
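The core of a variational style embedding, as described in the summary, is encoding an example clip to a distribution and sampling with the reparameterization trick. A toy sketch under invented dimensions and with linear maps standing in for the real networks (none of this is the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_style(example_motion, w_mu, w_logvar):
    """Toy style encoder: pool over time, then linear maps to mean and log-variance."""
    h = example_motion.mean(axis=0)          # temporal pooling
    return h @ w_mu, h @ w_logvar

def sample_style(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps keeps sampling differentiable."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Hypothetical sizes: 60 frames of 12-D motion features, 8-D style space.
motion = rng.standard_normal((60, 12))
w_mu = rng.standard_normal((12, 8)) * 0.1
w_logvar = rng.standard_normal((12, 8)) * 0.1
mu, logvar = encode_style(motion, w_mu, w_logvar)
z = sample_style(mu, logvar)                 # style code conditioning the gesture decoder
```

Because styles live in a continuous latent space, a new style example can be encoded at inference time with no retraining, which is what makes the control zero-shot.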
Article
Computer Science, Software Engineering
Lucas Mourot, Ludovic Hoyet, Francois Le Clerc, Francois Schnitzler, Pierre Hellier
Summary: This article provides a comprehensive survey on the state-of-the-art approaches in skeleton-based human character animation using deep learning and deep reinforcement learning. It covers motion data representations, common datasets, as well as methods to enhance deep models for learning spatial and temporal patterns in motion data. The latest methods are divided into motion synthesis, character control, and motion editing categories, with a discussion on limitations and future research directions.
COMPUTER GRAPHICS FORUM
(2022)
Article
Computer Science, Software Engineering
Andreas Aristidou, Anastasios Yiannakidis, Kfir Aberman, Daniel Cohen-Or, Ariel Shamir, Yiorgos Chrysanthou
Summary: This work presents a music-driven motion synthesis framework that generates long-term sequences of human motion synchronized with the input beats, forming a global structure that respects a specific dance genre. The framework generates diverse motions controlled by the content of the music, rather than just the beat. Results demonstrate natural and consistent movements across various dance types, with control over the content of the synthesized motions and respect for the overall structure of the dance.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2023)
Article
Computer Science, Software Engineering
Levi Fussell, Kevin Bergamin, Daniel Holden
Summary: This paper demonstrates a method for motion tracking of physically simulated characters using supervised learning and direct policy optimization. By training a world model to approximate a specific subset of the environment's transition function, the policy can be optimized to minimize tracking error. Compared to popular model-free methods, this approach consistently achieves higher-quality control in a shorter training time, with reduced sensitivity to the rate of experience gathering, dataset size, and distribution.
ACM TRANSACTIONS ON GRAPHICS
(2021)
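The key idea in the summary, optimizing a policy directly through a learned world model rather than via model-free RL, can be illustrated in miniature. In this hedged sketch the "world model" is a given linear map, the policy is linear, and the gradient of the tracking error is taken through the model analytically (the real method uses neural networks and full trajectories; everything here is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2                               # state and action dims (illustrative)
A = rng.standard_normal((n, n)) * 0.3     # stands in for a learned world model
B = rng.standard_normal((n, m)) * 0.3
K = np.zeros((m, n))                      # linear policy: a = K @ s

s = rng.standard_normal(n)                # current state
s_target = rng.standard_normal(n)         # pose we want to track

def loss_and_grad(K):
    """One-step tracking loss through the model, with its gradient w.r.t. K."""
    a = K @ s
    s_next = A @ s + B @ a                # differentiable rollout through the model
    err = s_next - s_target
    # Chain rule: dL/da = 2 B^T err, and da/dK gives the outer product with s.
    grad_K = 2.0 * np.outer(B.T @ err, s)
    return err @ err, grad_K

losses = []
for _ in range(200):
    L, g = loss_and_grad(K)
    losses.append(L)
    K -= 0.05 * g                         # gradient descent directly on the policy
```

The point of the toy: because the transition model is differentiable, tracking error flows back to the policy parameters by ordinary supervised-learning machinery, with no policy-gradient estimation.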
Article
Computer Science, Artificial Intelligence
Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Xiaoqin Zhang, Shijian Lu
Summary: This paper introduces a novel method called DIRFA, which can generate diverse and realistic facial animations for talking faces from the same driving audio. A probabilistic mapping network autoregressively converts audio signals into a facial animation sequence, and a temporally-biased mask models the temporal dependency of facial animations. Realistic talking faces can then be synthesized from the generated animation sequence and a source image.
PATTERN RECOGNITION
(2023)
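The summary does not define DIRFA's temporally-biased mask; one plausible generic reading is a causal attention mask whose bias decays with distance into the past, so recent frames dominate. The sketch below illustrates that reading only (the decay form and all names are assumptions, not the paper's design):

```python
import numpy as np

def temporally_biased_mask(T, decay=0.5):
    """Causal attention mask with an extra bias favouring recent frames.

    Entry (i, j) is the additive bias for query frame i attending to key
    frame j: -inf for future frames (causality), and an increasingly
    negative penalty the further j lies in the past.
    """
    idx = np.arange(T)
    dist = idx[:, None] - idx[None, :]          # i - j: how far in the past j is
    return np.where(dist < 0, -np.inf, -decay * dist.astype(float))

def masked_attention_weights(scores, mask):
    """Numerically stable softmax of (scores + mask) along the key axis."""
    z = scores + mask
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

With uniform scores, the resulting weights are zero on future frames and strictly larger for more recent past frames, which is the qualitative behaviour a temporal bias is meant to impose on the autoregressive generator.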
Article
Computer Science, Software Engineering
Yong Zhao, Le Yang, Ercheng Pei, Meshia Cedric Oveneke, Mitchel Alioscha-Perez, Longfei Li, Dongmei Jiang, Hichem Sahli
Summary: This paper proposes a novel synthesis-by-analysis approach that leverages a GAN framework and a state-of-the-art AU detection model to achieve better AU-driven facial expression generation. With a novel discriminator architecture and a balanced sampling approach, experimental results show that the method outperforms the state of the art in the realism and expressiveness of generated facial expressions.
COMPUTER GRAPHICS FORUM
(2021)