Reinforcement learning for parameter estimation in statistical spoken dialogue systems

COMPUTER SPEECH AND LANGUAGE (2012)

Journal

COMPUTER SPEECH AND LANGUAGE

Volume 26, Issue 3, Pages 168-192

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD

DOI: 10.1016/j.csl.2011.09.004

Keywords

Spoken dialogue systems; Reinforcement learning; POMDP; Dialogue management

Funding

EU [216594]

Abstract

Reinforcement learning techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estimate the parameters of a dialogue policy, which selects the system's responses based on the inferred dialogue state. However, the inference of the dialogue state itself depends on a dialogue model, which describes the expected behaviour of a user when interacting with the system. Ideally, the parameters of this dialogue model should also be optimised to maximise the expected cumulative reward. This article presents two novel reinforcement learning algorithms for learning the parameters of a dialogue model. First, the Natural Belief Critic algorithm is designed to optimise the model parameters while the policy is kept fixed. This algorithm is suitable, for example, in systems using a handcrafted policy, perhaps prescribed by other design considerations. Second, the Natural Actor and Belief Critic algorithm jointly optimises both the model and the policy parameters. The algorithms are evaluated on a statistical dialogue system modelled as a Partially Observable Markov Decision Process in a tourist information domain. The evaluation is performed with a user simulator and with real users. The experiments indicate that model parameters estimated to maximise the expected reward provide improved performance compared to the baseline handcrafted parameters. © 2011 Elsevier Ltd. All rights reserved.
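
The idea of tuning model parameters against reward while the policy stays fixed can be illustrated with a toy sketch. This is not the paper's Natural Belief Critic, whose details the abstract does not give. It assumes a single scalar model parameter drawn from a Gaussian with mean `mu`, a hypothetical `simulated_reward` function standing in for running a dialogue with a fixed policy, and a natural-gradient estimate obtained by regressing reward on the score of the sampling distribution. All names, the reward shape, and the constants are illustrative assumptions.

```python
import random


def simulated_reward(tau, rng):
    """Hypothetical stand-in for the reward of one dialogue run with
    model parameter tau under a fixed policy (assumed shape, peak at 0.7)."""
    return -(tau - 0.7) ** 2 + rng.gauss(0.0, 0.05)


def natural_gradient_estimate(mu, sigma, n_episodes, rng):
    """Estimate the natural gradient of expected reward w.r.t. mu.

    Each episode samples tau ~ N(mu, sigma^2) and records the score
    d/dmu log p(tau) = (tau - mu) / sigma^2 together with the reward.
    The least-squares slope of reward on score equals the inverse Fisher
    information times the vanilla gradient, i.e. the natural gradient.
    """
    scores, rewards = [], []
    for _ in range(n_episodes):
        tau = rng.gauss(mu, sigma)
        scores.append((tau - mu) / sigma ** 2)
        rewards.append(simulated_reward(tau, rng))
    mean_s = sum(scores) / n_episodes
    mean_r = sum(rewards) / n_episodes
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(scores, rewards))
    var = sum((s - mean_s) ** 2 for s in scores)
    return cov / var


def optimise(mu=0.2, sigma=0.2, step=5.0, iterations=30, n_episodes=200, seed=0):
    """Repeatedly move mu along the estimated natural gradient."""
    rng = random.Random(seed)
    for _ in range(iterations):
        mu += step * natural_gradient_estimate(mu, sigma, n_episodes, rng)
    return mu


final_mu = optimise()
print(f"estimated model parameter: {final_mu:.3f}")
```

In this sketch only the mean of the sampling distribution is adapted. A real dialogue model has many parameters, so the regression would be multivariate and the reward would come from simulated or real dialogues rather than a closed-form function.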
