Review

Statistical parametric speech synthesis

Journal

SPEECH COMMUNICATION
Volume 51, Issue 11, Pages 1039-1064

Publisher

ELSEVIER
DOI: 10.1016/j.specom.2009.04.004

Keywords

Speech synthesis; Unit selection; Hidden Markov models

Funding

  1. Ministry of Education, Culture, Sports, Science and Technology (MEXT)
  2. Hori Information Science Promotion Foundation
  3. JSPS [1880009]
  4. European Community's Seventh Framework Programme [FP7/2007-2013]
  5. US National Science Foundation, Division of Information & Intelligent Systems [0415021]

Abstract

This review gives a general overview of techniques used in statistical parametric speech synthesis. One instance of these techniques, called hidden Markov model (HMM)-based speech synthesis, has recently been demonstrated to be very effective in synthesizing acceptable speech. This review also contrasts these techniques with the more conventional technique of unit-selection synthesis that has dominated speech synthesis over the last decade. The advantages and drawbacks of statistical parametric synthesis are highlighted and we identify where we expect key developments to appear in the immediate future. (C) 2009 Elsevier B.V. All rights reserved.
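As context for the contrast drawn in the abstract, the parametric approach generates a speech-parameter trajectory from statistical models rather than concatenating recorded units. A minimal sketch of that idea is maximum-likelihood parameter generation (MLPG): choose the static trajectory whose stacked static-plus-delta observations best fit per-frame Gaussians. The sketch below is illustrative only (a single feature dimension, a simplified central-difference delta window, made-up numbers) and is not code from the reviewed paper; it assumes NumPy.

```python
import numpy as np

def mlpg(mu, var, T):
    """Toy maximum-likelihood parameter generation.

    mu, var: length-2T arrays of stacked [static_t, delta_t] Gaussian
    means and (diagonal) variances for T frames.
    Returns the static trajectory c (length T) maximizing the likelihood
    of o = W c, where W maps statics to stacked static+delta features.
    """
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                 # static row: copies c[t]
        if t > 0:                         # delta row: 0.5*(c[t+1]-c[t-1])
            W[2 * t + 1, t - 1] = -0.5
        if t < T - 1:
            W[2 * t + 1, t + 1] = 0.5
    P = np.diag(1.0 / var)                # precision of the diagonal Gaussians
    # Weighted least squares: solve (W' P W) c = W' P mu
    return np.linalg.solve(W.T @ P @ W, W.T @ P @ mu)

# Means chosen so statics and deltas are mutually consistent.
mu = np.array([0.0, 0.5, 1.0, 1.0, 2.0, -0.5])  # frame t: [static_t, delta_t]
var = np.ones(6)
c = mlpg(mu, var, T=3)                    # -> [0., 1., 2.]
```

Because the delta constraints couple adjacent frames, the solved trajectory is smooth across state boundaries; real systems apply this per stream with delta and delta-delta windows over HMM state-level statistics.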

Recommended

Review Engineering, Electrical & Electronic

Deep Learning for Acoustic Modeling in Parametric Speech Generation

Zhen-Hua Ling, Shi-Yin Kang, Heiga Zen, Andrew Senior, Mike Schuster, Xiao-Jun Qian, Helen Meng, Li Deng

IEEE SIGNAL PROCESSING MAGAZINE (2015)

Article Acoustics

Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization

Heiga Zen, Norbert Braunschweiler, Sabine Buchholz, Mark J. F. Gales, Kate Knill, Sacha Krstulovic, Javier Latorre

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2012)

Article Acoustics

Autoregressive Models for Statistical Parametric Speech Synthesis

Matt Shannon, Heiga Zen, William Byrne

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2013)

Article Engineering, Electrical & Electronic

Speech Synthesis Based on Hidden Markov Models

Keiichi Tokuda, Yoshihiko Nankaku, Tomoki Toda, Heiga Zen, Junichi Yamagishi, Keiichiro Oura

PROCEEDINGS OF THE IEEE (2013)

Article Engineering, Electrical & Electronic

Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques

Reinhold Haeb-Umbach, Shinji Watanabe, Tomohiro Nakatani, Michiel Bacchiani, Bjoern Hoffmeister, Michael L. Seltzer, Heiga Zen, Mehrez Souden

IEEE SIGNAL PROCESSING MAGAZINE (2019)

Proceedings Paper Audiology & Speech-Language Pathology

PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS

Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu

Summary: PnG BERT is a new encoder model for neural TTS that incorporates phoneme and grapheme representations as input, resulting in more natural prosody and accurate pronunciation. Experimental results demonstrate that a neural TTS model pre-trained with PnG BERT outperforms baseline models.

INTERSPEECH 2021 (2021)

Proceedings Paper Audiology & Speech-Language Pathology

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

Summary: WaveGrad 2 is a non-autoregressive generative model for text-to-speech synthesis that generates high fidelity audio through an iterative refinement process and allows for a trade-off between inference speed and sample quality by adjusting the number of refinement steps. Experiments show that it approaches the performance of state-of-the-art neural TTS systems.

INTERSPEECH 2021 (2021)

Proceedings Paper Audiology & Speech-Language Pathology

Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation

Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno

Summary: Semi and self-supervised training techniques can improve speech recognition performance without additional transcribed speech data. This study demonstrates the efficacy of two approaches by leveraging unspoken text and untranscribed audio, reducing word error rate in Indic language voice search tasks by up to 14.4%.

INTERSPEECH 2021 (2021)

Proceedings Paper Audiology & Speech-Language Pathology

Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu

Summary: This paper introduces Parallel Tacotron 2, a non-autoregressive neural text-to-speech model with a fully differentiable duration model that can learn token-frame alignments and durations automatically. Experimental results show that Parallel Tacotron 2 outperforms baselines in subjective naturalness in several diverse multi-speaker evaluations.

INTERSPEECH 2021 (2021)

Proceedings Paper Acoustics

Fully-Hierarchical Fine-Grained Prosody Modeling for Interpretable Speech Synthesis

Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Proceedings Paper Acoustics

Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior

Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Proceedings Paper Computer Science, Artificial Intelligence

Sequence-to-Sequence Neural Network Model with 2D Attention for Learning Japanese Pitch Accents

Antoine Bruguier, Heiga Zen, Arkady Arkhangorodsky

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES (2018)

Proceedings Paper Acoustics

Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices

Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, Przemyslaw Szczepaniak

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES (2016)

Proceedings Paper Acoustics

Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN based Statistical Parametric Speech Synthesis

Bo Li, Heiga Zen

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES (2016)

Proceedings Paper Acoustics

Directly Modeling Voiced and Unvoiced Components in Speech Waveforms by Neural Networks

Keiichi Tokuda, Heiga Zen

2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS (2016)

Article Acoustics

Compact deep neural networks for real-time speech enhancement on resource-limited devices

Fazal E. Wahab, Zhongfu Ye, Nasir Saleem, Rizwan Ullah

Summary: This study presents a compact neural model designed in a complex frequency domain for real-time speech enhancement. The proposed model outperforms benchmark models and improves speech quality and intelligibility. The incorporation of attention-gate-based skip connections further enhances the performance.

SPEECH COMMUNICATION (2024)