Who are you?
My name is Juho Leinonen and I am completing my PhD studies in the
What is your research topic?
The topic of my Master’s thesis was automatic speech recognition for the Sámi language, and I can build on that experience in my PhD work as well. In my current research on chatbots and forced alignment of speech, I still need language models and acoustic models, both of which are also required in automatic speech recognition. In speech recognizers, language models help to recognize words that are pronounced in an unclear or ambiguous way, whereas chatbots need language models for generating new text. Language models can also be applied to assessing the quality of text generated by bots. The process becomes circular: in order to evaluate the results reliably, we need to understand what high-quality text is like, but the same understanding is a prerequisite for generating text in the chatbot. This constitutes a philosophical problem as well as an engineering one.
The goal in traditional speech recognition is to find the text that corresponds to the audio recording as closely as possible. When developing a speech recognizer, previously aligned speech data is first required in order to train the acoustic models. Aligning text with speech is therefore routine work in speech recognition. However, speech alignment would be useful for researchers in other fields as well, and they can hardly all be expected to become speech recognition professionals before getting started with their own research. During the past year, I have packaged the speech recognition and alignment tools used in our research group into a toolkit that is as easy to share as possible. I am also searching for good measures for assessing the quality of the alignment. My goal is to find out which acoustic models or features produce the best alignment, and in what sort of situations it is possible or worthwhile to use models trained on major languages for aligning minority languages. This research has also opened up the world of language researchers to me, since I am trying to adapt the tool to suit their purposes as well as possible.
How is your research related to Kielipankki?
On the spur of the moment, I ended up testing the Finnish speech recognizer, developed by our group, for aligning the
For training chatbots, I also use the
Publications related to Kielipankki
Leinonen, J., Smit, P., Virpioja, S., & Kurimo, M. (2017).
Leino, K., Leinonen, J., Singh, M., Virpioja, S., & Kurimo, M. (2020).
Leinonen, J., Virpioja, S., & Kurimo, M. (2021, May).
More information on the aforementioned resources in Kielipankki
The