Who are you?
My name is Tamás Grósz and I am a Research Fellow in the
What is your research topic?
During my PhD years, my research was focused on Speech Technology, specifically on developing new deep-learning-based solutions for Automatic Speech Recognition (ASR). Although my main interest was acoustic modelling, I was also active in other areas. Paralinguistics, in particular, piqued my interest, and I worked on a wide variety of tasks. I regularly participated in the
After graduation, I joined Mikko Kurimo’s lab as a postdoc, where I had the opportunity to work on other topics, including language modelling and AI explainability. Initially, I worked on subword-based language models for agglutinative languages such as Hungarian and Finnish. While working with various models, I noticed the importance of curriculum learning. As a spin-off project, I started investigating different ways of estimating the difficulty of training samples and constructing new curricula for AI models.
Simultaneously, working on projects like
In 2022, we developed a system that recognizes different kinds of stuttering (e.g. word/phrase repetition, prolongation and sound repetition) and won the INTERSPEECH 2022 Stefan Steidl Computational Paralinguistics Award. Later, we investigated how the emotional state of speakers can be recognized from non-verbal vocal expressions (such as laughter, cries, moans and screams). Our system achieved first place in both tasks in the ACM MM ComParE competition. Since then, I have also worked on multimodal solutions for emotion and humor detection.
My current work mainly focuses on training and understanding self-supervised foundation models as part of our
How is your research related to Kielipankki?
As modern speech recognizers require a considerable amount of data, it became a priority to collect and annotate suitable corpora. In 2020, I joined the team creating the
Currently, I am also involved in the
Recent publications
Getman, Y., Grósz, T., Hiovain-Asikainen, K. & Kurimo, M. (2024),
Karakasidis, G., Kurimo, M., Bell, P. & Grósz, T. (2024),
Moisio, A., Porjazovski, D., Rouhe, A., Getman, Y., Virkkunen, A., Al-Ghezi, R., Lennes, M., Grósz, T., Linden, K. & Kurimo, M. (2023),
Phan, N., von Zansen, A., Kautonen, M., Grósz, T. & Kurimo, M. (2024),
Virkkunen, A., Sarvas, M., Huang, G., Grósz, T. & Kurimo, M. (2024),
Getman, Y., Phan, N., Al-Ghezi, R., Voskoboinik, E., Singh, M., Grósz, T., Kurimo, M., Salvi, G., Svendsen, T., Strömbergsson, S. et al. (2023),
Grósz, T., Getman, Y., Al-Ghezi, R., Rouhe, A. & Kurimo, M. (2023),
Grósz, T., Virkkunen, A., Porjazovski, D. & Kurimo, M. (2023),
Porjazovski, D., Getman, Y., Grósz, T. & Kurimo, M. (2023),
Corpora
The