Electrical and Computer Engineering, Department of

Department of Electrical and Computer Engineering: Faculty Publications

Deep Reinforcement Learning for Articulatory Synthesis in a Vowel-to-Vowel Imitation Task

Denis Shitov, RMIT University
Elena Pirogova, RMIT University
Tadeusz A. Wysocki, University of Nebraska-Lincoln, Bydgoszcz University of Science and Technology
Margaret Lech, RMIT University

Document Type

Article

Date of this Version

3-24-2023

Citation

Shitov, D.; Pirogova, E.; Wysocki, T.A.; Lech, M. Deep Reinforcement Learning for Articulatory Synthesis in a Vowel-to-Vowel Imitation Task. Sensors 2023, 23, 3437. https://doi.org/10.3390/s23073437

Comments

Open access.

Abstract

Articulatory synthesis is one of the approaches used for modeling human speech production. In this study, we propose a model-based algorithm for learning the policy to control the vocal tract of the articulatory synthesizer in a vowel-to-vowel imitation task. Our method does not require external training data, since the policy is learned through interactions with the vocal tract model. To improve the sample efficiency of the learning, we trained the model of speech production dynamics simultaneously with the policy. The policy was trained in a supervised way using predictions of the model of speech production dynamics. To stabilize the training, early stopping was incorporated into the algorithm. Additionally, we extracted acoustic features using an acoustic word embedding (AWE) model. This model was trained to discriminate between different words and to enable compact encoding of acoustics while preserving contextual information of the input. Our preliminary experiments showed that introducing this AWE model was crucial to guide the policy toward a near-optimal solution. The acoustic embeddings, obtained using the proposed approach, were revealed to be useful when applied as inputs to the policy and the model of speech production dynamics.

Included in

Computer Engineering Commons, Electrical and Computer Engineering Commons
