Study of non-linear frequency warping functions for speaker normalization

S. V. Bharath Kumar, S. Umesh, R. Sinha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In this paper, we study non-linear frequency-warping functions that are commonly used in speaker normalization. This study is motivated by our recently proposed affine-transformation model for speaker normalization [1], which has provided improved recognition performance compared to the uniform-scaling model [1, 2]. Using formant data from the Peterson & Barney and Hillenbrand vowel databases, we analyze the behavior of the scale factor as a function of frequency. Empirical observations [3, 4] show that while the uniform-scaling assumption may be valid at higher frequencies, there are significant deviations at low frequencies. We show that while our recently proposed model behaves similarly to the empirical result, the behavior of many commonly used non-linear models (including those of Eide-Gish, the power law, and the bilinear transformation) differs significantly from it. This difference from the empirical observation may explain the limited improvement in recognition performance that these non-linear models provide over the conventional uniform-scaling model. We also show that our proposed model fits the formant data better than these non-linear models. We therefore conclude that the affine-transformation model may be a more appropriate non-linear model for speaker normalization.
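As a rough illustration of what "scale factor as a function of frequency" means, the sketch below computes the ratio f'/f for three warping functions. It is not taken from the paper: the function names, the 16 kHz sampling rate, the parameter values, and the particular normalization of the power law are all assumptions. The bilinear form is the standard first-order all-pass phase warping. The affine-transformation model of [1] and the Eide-Gish model are not reproduced here, since the abstract does not give their formulas.

```python
import math

def uniform_warp(f, alpha):
    """Uniform scaling: every frequency is multiplied by the same factor."""
    return alpha * f

def power_warp(f, beta, f_max=8000.0):
    """Power-law warping, normalized so that f_max maps to itself (assumed form)."""
    return f_max * (f / f_max) ** beta

def bilinear_warp(f, a, fs=16000.0):
    """Bilinear (first-order all-pass) warping; `a` is the all-pass coefficient."""
    w = 2.0 * math.pi * f / fs  # normalized angular frequency in [0, pi]
    w_warped = w + 2.0 * math.atan2(a * math.sin(w), 1.0 - a * math.cos(w))
    return w_warped * fs / (2.0 * math.pi)

def scale_factor(warp, f, param):
    """Local scale factor f'/f implied by a warping function at frequency f (Hz)."""
    return warp(f, param) / f

if __name__ == "__main__":
    # Illustrative parameter values only; they are not fitted to any data.
    for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
        print(
            f,
            round(scale_factor(uniform_warp, f, 1.1), 4),
            round(scale_factor(power_warp, f, 0.95), 4),
            round(scale_factor(bilinear_warp, f, 0.1), 4),
        )
```

Under uniform scaling the ratio is the same at every frequency, whereas the power-law and bilinear functions produce a ratio that changes with frequency. How each non-linear model's frequency dependence compares with the formant data is the question the paper addresses.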

Original language: English (US)
Title of host publication: 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings
Pages: I1245-I1248
State: Published - 2006
Event: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 - Toulouse, France
Duration: May 14 2006 to May 19 2006

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
ISSN (Print): 1520-6149

Other

Other: 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006
Country/Territory: France
City: Toulouse
Period: 5/14/06 to 5/19/06

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
