Who are you?
I am Okko Räsänen, Associate Professor and Academy Research Fellow at the Unit of Computing Sciences of Tampere University, where I also lead the Speech and Cognition research group. Before moving to Tampere, I worked at the Department of Signal Processing and Acoustics at Aalto University, where I am Docent in Speech Processing.
What is your research topic?
The main topics of my research are the computational modeling of infants’ early language acquisition and the speech that infants hear. Our aim is to understand the principles of information processing that underlie language learning: What sort of transformations and processing steps does the speech signal undergo in the human brain so that the individual can learn to comprehend it, and how can we build similar language capabilities into artificial intelligence systems? We are interested in what sort of linguistic structures can be acquired in a language-independent and unsupervised manner from speech and from the rest of the sensory information that is available to children. We also study the learning mechanisms and presuppositions that must be included in the models in order for the learning to succeed. Another interesting question is what kind of language input and other multisensory information infants are generally able to hear and perceive during their early language development, and to what extent the acquisition of linguistic structures (e.g., sounds and words) is supported by the amount, quality, and multisensory nature of the input.
In addition to computational models, we have also developed practical tools for the automated analysis of large child-centered audio datasets, which can help us to better understand the characteristics of the speech heard by children. The datasets typically consist of day-long recordings made with wearable microphones in children’s natural acoustic and linguistic environments. For example, in the recently completed international collaboration project Analyzing Child Language Experiences around the World, we analyzed about 14,000 hours of child-centered audio material in order to study children’s early language experiences in various linguistic and cultural settings. Our next goal is to turn the analysis results into publications.
Computational research on language learning is multidisciplinary and interesting work, but it is also challenging. Working with speech signals and modeling human learning processes requires an in-depth command of signal processing and machine learning methods. It is equally important to have a good understanding of phonetics, early language development, and the functioning of human cognition, so that the new models and methods can be reconciled with theory and data from language development research.
In addition to research on language acquisition, my research team develops various analysis methods for speech, e.g., for evaluating the state of health or the emotional state of a speaker. My group is also involved in the development of smart wearables for babies for the clinical assessment and monitoring of their neurophysiological and motor development (as part of the Academy of Finland’s Health from Science research program). Moreover, I work on many other themes in speech technology, cognitive science, and signal analytics based on machine learning. The signal processing and machine learning methods used in speech technology are often also well suited to processing a wide variety of time series data.
How is your research related to Kielipankki?
In my research, I have used the FinDialogue corpus, which is currently on its way to the Language Bank of Finland, and I am also familiar with many other corpora provided by the Language Bank. I am looking forward to the release of the speech material collected during the Donate Speech campaign for research use. In my opinion, the Language Bank is also a viable publication channel for any new data that we may create in our future research.
Publications related to Kielipankki
Khorrami, K. & Räsänen, O. (2021). Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? – A computational investigation. Language Development Research.
Räsänen, O., Seshadri, S., Lavechin, M., Cristia, A. & Casillas, M. (2021). ALICE: An open-source tool for automatic measurement of phoneme, syllable, and word counts from child-centered daylong recordings. Behavior Research Methods, 53, 818–835.
Räsänen, O., Doyle, G. & Frank, M. C. (2018). Pre-linguistic segmentation of speech into syllable-like units. Cognition, 171, 130–150.
Kakouros, S., Salminen, N. & Räsänen, O. (2018). Making predictable unpredictable with style — Behavioral and electrophysiological evidence for the critical role of prosodic expectations in the perception of prominence in speech. Neuropsychologia, 109, 181–199.
Räsänen, O., Kakouros, S. & Soderstrom, M. (2018). Is infant-directed speech interesting because it is surprising? — Linking properties of IDS to statistical learning and attention at the prosodic level. Cognition, 178, 193–206.
Rasilo, H. & Räsänen, O. (2017). An online model of vowel imitation learning. Speech Communication, 86, 1–23.
Räsänen, O. & Rasilo, H. (2015). A joint model of word segmentation and meaning acquisition through cross-situational learning. Psychological Review, 122(4), 792–829.
More information on the aforementioned resources in Kielipankki
- Donate Speech campaign (the data will be made available to researchers and companies during 2021)
- FinDialogue
The FIN-CLARIN consortium consists of a group of Finnish universities along with CSC – IT Center for Science and the Institute for the Languages of Finland (Kotus). FIN-CLARIN helps researchers in Finland to use, refine, preserve, and share their language resources. The Language Bank of Finland is the collection of services that provides language materials and tools for the research community.