Article; Proceedings Paper

JALI: An Animator-Centric Viseme Model for Expressive Lip Synchronization

Journal

ACM TRANSACTIONS ON GRAPHICS
Volume 35, Issue 4

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/2897824.2925984

Keywords

facial animation; procedural animation; lip synchronization; speech synchronization; audio-visual speech

The rich signals we extract from facial expressions impose high expectations for the science and art of facial animation. While the advent of high-resolution performance capture has greatly improved realism, the utility of procedural animation warrants a prominent place in the facial animation workflow. We present a system that, given an input audio soundtrack and speech transcript, automatically generates expressive lip-synchronized facial animation that is amenable to further artistic refinement, and that is comparable with both performance capture and professional animator output. Because of the diversity of ways we produce sound, the mapping from phonemes to visual depictions as visemes is many-valued. We draw from psycholinguistics to capture this variation using two visually distinct anatomical actions: Jaw and Lip, where sound is primarily controlled by jaw articulation and lower-face muscles, respectively. We describe the construction of a transferable template JALI 3D facial rig, built upon the popular facial muscle action unit representation FACS. We show that acoustic properties in a speech signal map naturally to the dynamic degree of jaw and lip in visual speech. We provide an array of compelling animation clips, compare against performance capture and existing procedural animation, and report on a brief user study.
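
The many-valued phoneme-to-viseme mapping described in the abstract can be illustrated with a small sketch. This is not the paper's rig or its FACS-based formulation; it is a toy model that assumes each phoneme has a single fully articulated target pose, and that two independent controls in [0, 1], here called `jaw` and `lip`, scale the jaw and lip components of that pose. The class `VisemeTarget`, the table `VISEME_TARGETS`, the function `articulate`, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisemeTarget:
    """Hypothetical fully articulated pose for one phoneme."""
    jaw_open: float   # jaw opening at full articulation (illustrative value)
    lip_shape: float  # lip-muscle activation at full articulation (illustrative value)


# Toy lookup table; the entries are assumptions, not values from the paper.
VISEME_TARGETS = {
    "AA": VisemeTarget(jaw_open=1.0, lip_shape=0.25),
    "UW": VisemeTarget(jaw_open=0.25, lip_shape=1.0),
    "M": VisemeTarget(jaw_open=0.0, lip_shape=0.5),
}


def clamp01(x: float) -> float:
    """Restrict a control value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def articulate(phoneme: str, jaw: float, lip: float) -> tuple[float, float]:
    """Scale a phoneme's target pose by independent jaw and lip controls.

    Returns (jaw_open, lip_shape) for the resulting mouth pose.
    """
    target = VISEME_TARGETS[phoneme]
    return (target.jaw_open * clamp01(jaw), target.lip_shape * clamp01(lip))


if __name__ == "__main__":
    # The same phoneme yields different visible poses as the controls change.
    print(articulate("AA", jaw=0.5, lip=1.0))  # mostly lips, reduced jaw
    print(articulate("AA", jaw=1.0, lip=0.0))  # jaw only, lips relaxed
```

Under these assumptions, one phoneme maps to a family of poses rather than a single viseme, which is the sense in which the mapping is many-valued. A system like the one described would drive the two controls from the audio signal instead of setting them by hand.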
