Computer Science and Engineering, Department of

Date of this Version

5-7-2023

Citation

ACM ISBN 979-8-4007-0744-5/23/05. https://doi.org/10.1145/3589462.3589488

Comments

Used by permission.

Abstract

Speech recognition is difficult when the speech signal is weak or occurs in a noisy environment. This paper presents an efficient and robust method that reconstructs the standard pronunciation of English phonemes and words from a weak or noisy signal. The method is based on a novel representation of the reconstruction task as a problem of data retrieval from a database, considered in two cases: (1) the phonemes are represented in the database as binary tuples, and the input is also a binary tuple in which deletion errors may have occurred; and (2) the phonemes are represented both in the database and in the input as tuples of real values between 0 and 1. In the latter case, the values in the input phoneme may be either higher or lower than those of the standard phoneme in the database that the speaker intended. For case (2), a theorem is proven regarding the conditions under which the data retrieval can be expected to be reliable.
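
The two retrieval cases in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the toy databases, the function names `retrieve_binary` and `retrieve_real`, the reading of a "deletion error" as a 1 turning into a 0, and the use of Euclidean distance for case (2) are all assumptions made purely for illustration.

```python
from math import dist

# Hypothetical toy databases: phoneme label -> feature tuple.
# The labels and values are invented for illustration only.
BINARY_DB = {
    "p": (1, 0, 1, 1, 0),
    "b": (1, 1, 1, 1, 0),
    "s": (0, 1, 0, 1, 1),
}
REAL_DB = {
    "p": (0.9, 0.1, 0.8),
    "b": (0.8, 0.7, 0.9),
    "s": (0.1, 0.9, 0.2),
}


def retrieve_binary(query, db):
    """Case (1), assuming a deletion error turns a 1 into a 0.

    A stored tuple is a candidate if it has a 1 wherever the query does.
    Among candidates, return the labels needing the fewest deletions
    (all of them, sorted, if there is a tie).
    """
    candidates = {}
    for label, ref in db.items():
        if len(ref) != len(query):
            continue
        if all(r >= q for r, q in zip(ref, query)):
            # Number of 1s in ref that are missing from the query.
            candidates[label] = sum(r - q for r, q in zip(ref, query))
    if not candidates:
        return []
    best = min(candidates.values())
    return sorted(label for label, d in candidates.items() if d == best)


def retrieve_real(query, db):
    """Case (2), assuming nearest neighbour under Euclidean distance.

    Input values may lie above or below the stored values, so a symmetric
    distance is used here; the paper's actual criterion may differ.
    """
    return min(db, key=lambda label: dist(query, db[label]))


print(retrieve_binary((1, 0, 0, 1, 0), BINARY_DB))  # ['p']
print(retrieve_binary((0, 0, 0, 1, 0), BINARY_DB))  # ['p', 's'] (ambiguous)
print(retrieve_real((0.7, 0.2, 0.9), REAL_DB))      # p
```

The second binary query shows why reliability matters: with enough deletions, more than one stored phoneme explains the input equally well, which is the kind of situation a reliability condition would need to rule out.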
