Noise Robust Speech Recognition Activities in the Aalto Speech Group Based on Feature Enhancement and Uncertainty Modeling Approaches
Kalle Palomaeki
ICSI & Aalto University, Finland
Thursday, February 5, 2013
12:30 PM, Conference Room 5A
Abstract
This talk covers work on noise-robust automatic speech recognition (ASR) in the Aalto Speech Group. Our team's two main approaches to noise-robust ASR are missing feature reconstruction and sound separation. The missing feature method is motivated by psychoacoustic research suggesting that human listeners cope with noisy speech by relying on clean, noise-free spectro-temporal glimpses of speech. Here, we reconstruct the noisy regions using models of clean speech statistics. In the sound separation approach, noisy speech is modeled as a combination of exemplar spectra of clean speech and noise, drawn from separate speech and noise dictionaries. The speech can then be reconstructed from the exemplars that came from the speech dictionary. In both approaches the reconstructed speech is almost always inaccurate. This can be addressed by modeling the uncertainty that the reconstruction introduces.
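As a rough illustration of the sound separation idea described above, the following minimal sketch decomposes one noisy magnitude spectrum into non-negative weights over a speech dictionary and a noise dictionary, then keeps only the speech part. It is not the group's implementation: the function name `separate_speech`, the random toy dictionaries, the Euclidean multiplicative update rule, and the final soft mask are all assumptions chosen for brevity.

```python
import numpy as np

def separate_speech(y, speech_dict, noise_dict, n_iter=200, eps=1e-12):
    """Estimate the speech part of a non-negative noisy spectrum y.

    speech_dict and noise_dict hold one exemplar spectrum per column.
    (Hypothetical helper, written only for this illustration.)
    """
    # Stack both dictionaries so y is explained as A @ x with x >= 0.
    A = np.hstack([speech_dict, noise_dict])
    x = np.ones(A.shape[1])
    Aty = A.T @ y
    AtA = A.T @ A
    # Multiplicative updates for the Euclidean cost keep x non-negative.
    for _ in range(n_iter):
        x *= Aty / (AtA @ x + eps)
    n_speech = speech_dict.shape[1]
    speech_part = speech_dict @ x[:n_speech]
    noise_part = noise_dict @ x[n_speech:]
    # Soft mask: the fraction of each bin attributed to speech exemplars.
    mask = speech_part / (speech_part + noise_part + eps)
    return mask * y, x

# Toy data (assumed): 20 frequency bins, 5 speech and 3 noise exemplars.
rng = np.random.default_rng(0)
S = rng.random((20, 5))
N = rng.random((20, 3))
y = S[:, 1] + 0.5 * N[:, 2]  # one speech exemplar plus scaled noise
speech_est, activations = separate_speech(y, S, N)
```

The masked estimate never exceeds the noisy observation in any bin, and it is only an approximation of the clean speech, which is why the reconstruction uncertainty mentioned in the abstract matters for the recognizer.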
Speaker Bio
Kalle Palomaeki is currently visiting ICSI from Aalto University, Finland. He is a senior researcher in the Speech Group, which is currently moving to Signal Processing and Acoustics at Aalto. He holds a five-year research fellow position from the Academy of Finland and leads the noise-robust team in the Speech Group. He received his PhD from Aalto in 2005; his thesis covered rather diverse research topics, including computational auditory scene analysis models, noise-robust ASR, and auditory brain measurements.