Non-uniform speaker normalization using affine-transformation

S. V. Bharath Kumar, S. Umesh, Rohit Sinha

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations

Abstract

In this paper, we propose a mathematical model to describe the relation between the formant frequencies of speakers and show that, with the proposed affine model, speaker differences separate out as translation factors when a "mel-like" warping is performed. Using speech data, we estimate the parameters of this warping function and show that it is close to the usual mel formula. This model is motivated by Rohit et al.'s shift-based non-uniform speaker-normalization method, which provides improvement over conventional maximum-likelihood-based speaker normalization methods. We therefore provide a unified mathematical framework that connects the relationship between the formants of speakers with the method of removing speaker differences (which involves mel-warping), and this framework is substantiated by our recognition experiments.
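The central claim, that an affine relation between formants turns into a pure translation under a mel-like warping, can be illustrated numerically. The sketch below is a minimal example under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the affine model, so the relation (f' + c) = alpha * (f + c) is assumed here, the constant c = 700 Hz is borrowed from the common mel formula (the paper estimates its own warping parameters from speech data), and the formant values, the factor `alpha`, and all function names are hypothetical.

```python
import math

# Assumption: 700 Hz is the constant of the common mel formula,
# m = 2595 * log10(1 + f / 700); the paper estimates its own value from data.
MEL_CORNER_HZ = 700.0


def mel_like_warp(f_hz, c=MEL_CORNER_HZ):
    """Mel-like warping: natural log of (1 + f / c)."""
    return math.log(1.0 + f_hz / c)


def affine_map(f_hz, alpha, c=MEL_CORNER_HZ):
    """Assumed affine relation between speakers: (f' + c) = alpha * (f + c)."""
    return alpha * (f_hz + c) - c


# Hypothetical formants of a reference speaker (Hz) and a hypothetical speaker factor.
reference_formants = [500.0, 1500.0, 2500.0]
alpha = 1.15

# Formants of a second speaker generated by the assumed affine model.
speaker_formants = [affine_map(f, alpha) for f in reference_formants]

# Difference between the two speakers in the warped domain, one value per formant.
shifts = [
    mel_like_warp(fs) - mel_like_warp(fr)
    for fs, fr in zip(speaker_formants, reference_formants)
]

for fr, fs, s in zip(reference_formants, speaker_formants, shifts):
    print(f"{fr:7.1f} Hz -> {fs:7.1f} Hz, warped shift = {s:.6f}")
print(f"log(alpha) = {math.log(alpha):.6f}")
```

Under these assumptions the warped-domain difference is the same for every formant and equals log(alpha), because log(1 + f'/c) = log(alpha) + log(1 + f/c). In the linear frequency domain, by contrast, the same speaker difference depends on the frequency.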

Original language: English (US)
Pages (from-to): I121-I124
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
State: Published - 2004
Event: Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que, Canada
Duration: May 17 2004 → May 21 2004

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
