Noise Robust Speech Recognition Activities in the Aalto Speech Group Based on Feature Enhancement and Uncertainty Modeling Approaches

Speech Visitor Kalle Palomaeki

ICSI & Aalto University, Finland

Thursday, February 5, 2013
12:30 PM, Conference Room 5A

Abstract

This talk presents the noise-robust automatic speech recognition (ASR) activities of the Aalto Speech Group. Our team's two main approaches to noise-robust ASR are missing feature reconstruction and sound separation. The missing feature method is motivated by psychoacoustic research, according to which human listeners cope with noisy speech by relying on clean, noise-free spectro-temporal glimpses of speech. Here, we reconstruct the noisy regions based on models of clean speech statistics. In the sound separation approach, noisy speech is modeled as a combination of exemplar spectra of clean speech and noise, drawn from respective speech and noise dictionaries. The speech can then be reconstructed from the exemplars that came from the speech dictionary. In both the missing feature and sound separation approaches, the reconstructed speech is practically always inaccurate. This can be dealt with by modeling the uncertainty due to the reconstruction.
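The sound separation idea in the abstract can be illustrated with a minimal sketch. It is not the group's implementation. It assumes magnitude spectrograms, fixed non-negative dictionaries whose columns are exemplars, Euclidean multiplicative updates for the activations, and a Wiener-like mask for the final reconstruction. The function name `separate_speech` and all of its parameters are hypothetical.

```python
import numpy as np

def separate_speech(noisy, speech_dict, noise_dict, n_iter=200, eps=1e-12):
    """Estimate the speech part of a noisy magnitude spectrogram.

    noisy:       (n_bins, n_frames) non-negative magnitude spectrogram
    speech_dict: (n_bins, n_speech) non-negative speech exemplars (columns)
    noise_dict:  (n_bins, n_noise)  non-negative noise exemplars (columns)
    """
    # Stack both dictionaries so every frame is explained jointly.
    dictionary = np.hstack([speech_dict, noise_dict])
    n_speech = speech_dict.shape[1]

    # Non-negative activations: one weight per exemplar and frame.
    activations = np.ones((dictionary.shape[1], noisy.shape[1]))
    for _ in range(n_iter):
        approx = dictionary @ activations
        # Multiplicative update (assumed Euclidean cost, dictionary kept fixed).
        activations *= (dictionary.T @ noisy) / (dictionary.T @ approx + eps)

    # Split the explanation into its speech and noise contributions.
    speech_part = speech_dict @ activations[:n_speech]
    noise_part = noise_dict @ activations[n_speech:]

    # Wiener-like mask: fraction of each time-frequency cell assigned to speech.
    mask = speech_part / (speech_part + noise_part + eps)
    return mask * noisy
```

The mask stays between 0 and 1, so the estimate never exceeds the observed spectrum. In practice the estimate is still imperfect, which is the motivation for the uncertainty modeling mentioned in the abstract.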

Speaker Bio

Kalle Palomaki is currently visiting ICSI from Aalto University, Finland. He is a senior researcher in the Speech Group, which is currently moving to Signal Processing and Acoustics at Aalto. He holds a five-year research fellow position funded by the Academy of Finland and leads the noise-robust team in the Speech Group. He received his PhD in 2005 from Aalto; his thesis covered rather diverse research topics in computational auditory scene analysis models, noise-robust ASR, and auditory brain measurements.