Adaptation of Acoustic and Language Models for Speech That Is Hard for ASR
Mikko Kurimo
ICSI
Tuesday, February 21, 2012
12:30pm
One reason automatic speech recognition is so hard is that speech varies a lot. The variation is due to speaker characteristics, accents, languages, speaking styles, and recording conditions. Having specialized statistical models for all speakers and situations would give the best accuracy, but in practice we often have to apply inaccurate average models. In this talk, I will present my speech group's latest results on adaptive and unsupervised acoustic and language modeling methods, which try to bridge this gap. Relevant applications are in speech recognition, synthesis, retrieval, and translation.
Bio:
Mikko Kurimo is the leader of the Speech Group at the Information and Computer Science Department at Aalto University. After his PhD at the Helsinki University of Technology (which later became part of Aalto) and a postdoc at IDIAP, he became a professor and chief research scientist at Aalto. In addition to IDIAP, he has visited and worked with several research centers: University of Colorado at Boulder, University of Edinburgh, SRI, Cambridge University, and Nagoya Institute of Technology. His group at Aalto consists of 12 researchers who concentrate on several aspects of speech recognition and are funded mainly by projects of the Academy of Finland, Tekes, and the European Community. Their work is best known for unsupervised morpheme-based language modeling and for adaptation of acoustic and language models to various speakers and speaking styles.
In 2012, Mikko Kurimo and his PhD student Seppo Enarvi are visiting ICSI, working on ASR for spontaneous speech.