TY - GEN
T1 - Non-uniform speaker normalization using frequency-dependent scaling function
AU - Kumar, S. V. Bharath
AU - Umesh, S.
PY - 2004
Y1 - 2004
AB - In this paper, we present improvements in the estimation of the frequency-dependent scaling function for non-uniform speaker normalization when compared to the method in [1]. Further, unlike [1], we use the estimated frequency-dependent scaling function, γ(f), to estimate the universal warping function that is necessary to separate out the speaker-dependent term as a translation factor, and we show that it is similar to the mel scale. Since the proposed warping function is similar to the mel scale, we argue that our study "justifies" the use of the mel scale in speech recognition, not only from the point of view of psychoacoustics but also from the viewpoint of speaker normalization. Finally, in [2], we assumed the commonly used formula for the mel scale, mel = 2595 log10(1 + f/700), for the universal warping function. In this paper, we fit a mel-like formula to the estimated universal warping function and use it to perform non-uniform speaker normalization. We present the recognition results using these different universal warping functions, with word error rate as the performance measure.
UR - http://www.scopus.com/inward/record.url?scp=28244497099&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=28244497099&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:28244497099
SN - 0780386744
T3 - 2004 International Conference on Signal Processing and Communications, SPCOM
SP - 305
EP - 309
BT - 2004 International Conference on Signal Processing and Communications, SPCOM
T2 - 2004 International Conference on Signal Processing and Communications, SPCOM
Y2 - 11 December 2004 through 14 December 2004
ER -