Strategic Study of CAE >> 2006, Volume 8, Issue 3
Multi-band Synchronization Model for Speech Recognition Under Noisy Condition
Department of Radio Engineering, Southeast University, Nanjing 210096, China
Abstract
Based on the perceptual characteristics of the human ear, this paper proposes a synchronized multi-band maximum likelihood linear regression algorithm for robust speech recognition in noisy conditions. The algorithm uses maximum likelihood as the estimation criterion to compensate for the effects of noise, relying on a multi-band synchronization model and a noise-corruption assumption. Experiments show that the proposed algorithm effectively improves the performance of the recognition system.
Keywords
hidden Markov model; maximum likelihood; multi-band synchronization model; speech recognition