Article

Text-based Editing of Talking-head Video

Journal

ACM TRANSACTIONS ON GRAPHICS
Volume 38, Issue 4, Pages -

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3306346.3323028

Keywords

Text-based video editing; talking heads; visemes; dubbing; face tracking; face parameterization; neural rendering

Funding

  1. Brown Institute for Media Innovation
  2. Max Planck Center for Visual Computing and Communications
  3. ERC Consolidator Grant 4DRepLy [770784]
  4. Adobe Systems
  5. Office of the Dean for Research at Princeton University

Abstract

Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e. no jump cuts). Our method automatically annotates an input talking-head video with phonemes, visemes, 3D face pose and geometry, reflectance, expression and scene illumination per frame. To edit a video, the user has to only edit the transcript, and an optimization strategy then chooses segments of the input corpus as base material. The annotated parameters corresponding to the selected segments are seamlessly stitched together and used to produce an intermediate video representation in which the lower half of the face is rendered with a parametric face model. Finally, a recurrent video generation network transforms this representation to a photorealistic video that matches the edited transcript. We demonstrate a large variety of edits, such as the addition, removal, and alteration of words, as well as convincing language translation and full sentence synthesis.
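
As a rough illustration of the viseme-matching step the abstract describes, the Python sketch below picks the corpus snippet whose viseme sequence best matches an edited phrase. It is a minimal stand-in only: the phoneme-to-viseme table is a coarse placeholder, the greedy longest-match from difflib substitutes for the paper's actual optimization strategy, and all names here (PHONEME_TO_VISEME, best_matching_segment) are hypothetical.

from difflib import SequenceMatcher

# Hypothetical, coarse phoneme-to-viseme grouping for illustration only.
PHONEME_TO_VISEME = {
    "P": "BMP", "B": "BMP", "M": "BMP",
    "F": "FV", "V": "FV",
    "AA": "A", "AE": "A",
    "IY": "I", "IH": "I",
    "S": "S", "Z": "S",
}

def to_visemes(phonemes):
    # Collapse phonemes into viseme classes; unknown phonemes map to themselves.
    return [PHONEME_TO_VISEME.get(p, p) for p in phonemes]

def best_matching_segment(corpus_phonemes, query_phonemes):
    # Return (start, end) indices of the corpus span whose viseme sequence best
    # matches the visemes of the edited text. A greedy longest-match stands in
    # for the optimization strategy mentioned in the abstract.
    corpus_v = to_visemes(corpus_phonemes)
    query_v = to_visemes(query_phonemes)
    matcher = SequenceMatcher(a=corpus_v, b=query_v, autojunk=False)
    m = matcher.find_longest_match(0, len(corpus_v), 0, len(query_v))
    return m.a, m.a + m.size

if __name__ == "__main__":
    corpus = ["S", "IY", "P", "AA", "Z", "IH", "V"]  # viseme-annotated input video
    edit = ["Z", "IY", "B", "AE"]                    # phonemes of the newly typed words
    start, end = best_matching_segment(corpus, edit)
    print("Reuse corpus segment", start, "to", end)

In the method as summarized above, the annotated parameters of the chosen segments would then be stitched, rendered through the parametric face model, and passed to the recurrent video generation network; none of those stages is shown in this sketch.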
