Article

A study of lip movements during spontaneous dialog and its application to voice activity detection

Journal

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
Volume 125, Issue 2, Pages 1184-1196

Publisher

ACOUSTICAL SOC AMER AMER INST PHYSICS
DOI: 10.1121/1.3050257

This paper presents a quantitative and comprehensive study of the lip movements of a given speaker in different speech/nonspeech contexts, with a particular focus on silences (i.e., when no sound is produced by the speaker). The aim is to characterize the relationship between lip activity and speech activity and then to use visual speech information as a voice activity detector (VAD). To this end, an original audiovisual corpus was recorded with two speakers engaged in a face-to-face spontaneous dialog, although they were in separate rooms. Each speaker communicated with the other using a microphone, a camera, a screen, and headphones. This system was used to capture separate audio stimuli for each speaker and to synchronously monitor the speaker's lip movements. A comprehensive analysis was carried out on the lip shapes and lip movements in either silence or nonsilence (i.e., speech+nonspeech audible events). A single visual parameter, defined to characterize the lip movements, was shown to be efficient for the detection of silence sections. This results in a visual VAD that can be used in any kind of environmental noise, including intricate and highly nonstationary noises, e.g., multiple and/or moving noise sources or competing speech signals. © 2009 Acoustical Society of America. [DOI:10.1121/1.3050257]
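
The abstract does not define the single visual parameter, so the following is only a minimal sketch of how a lip-movement-based silence detector of this general kind could work, not the authors' method. It assumes that a lip tracker already supplies a lip width and lip height per video frame. The function names, the motion measure (summed absolute frame-to-frame change), the smoothing window, and the threshold are all hypothetical choices made for illustration.

```python
def lip_motion(widths, heights):
    """Per-frame lip motion: summed absolute change in width and height.

    Assumption: this measure is illustrative only; the paper's actual
    parameter is not given in the abstract.
    """
    motion = [0.0]  # the first frame has no predecessor to compare with
    for t in range(1, len(widths)):
        motion.append(abs(widths[t] - widths[t - 1])
                      + abs(heights[t] - heights[t - 1]))
    return motion


def smooth(values, window):
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for t in range(len(values)):
        segment = values[max(0, t - half): t + half + 1]
        out.append(sum(segment) / len(segment))
    return out


def detect_silence(widths, heights, threshold=0.5, window=3):
    """Return one flag per frame: True where smoothed lip motion is low.

    Assumption: `threshold` and `window` are hypothetical values that
    would have to be tuned on labeled data.
    """
    if len(widths) != len(heights):
        raise ValueError("widths and heights must have the same length")
    activity = smooth(lip_motion(widths, heights), window)
    return [a < threshold for a in activity]


# Synthetic example: 10 frames of still lips, then 10 frames of moving lips.
widths = [40.0] * 10 + [40.0 + 3.0 * (i % 2) for i in range(10)]
heights = [10.0] * 10 + [10.0 + 5.0 * (i % 2) for i in range(10)]
silence_flags = detect_silence(widths, heights)
```

Because the decision uses only visual measurements, a detector of this form is unaffected by acoustic noise, which is the property the abstract highlights. Still lips do not always mean silence, however, and the paper's analysis of that relationship is what would justify any particular parameter and threshold.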
