The Discovery Research Group consists of members with versatile backgrounds and interests. Read more about individual group members in the profiles below.
Prof. Hannu Toivonen works in the areas of artificial intelligence and data science, more specifically in computational creativity and data mining. He has introduced and solved several novel research problems in data mining since the field was born in the early 1990s; his definitions and algorithms have become standard references and textbook material in the field. He has since developed applications of data mining for gene mapping, context-aware computation, document analysis and summarisation, and computational creativity.
The current research focus of Hannu Toivonen is on using data science for computational creativity, on self-aware and creative systems, and on analysis and generation of natural language. Hannu recently served as Programme Chair of IEEE ICDM 2014, a leading international data mining conference, and of ICCC 2015, the international conference on computational creativity.
Mark's research interests lie in natural language processing (NLP), computational creativity and music processing/cognition. In all of these areas, he applies recent machine learning techniques to a variety of largely language-related problems.
He is a postdoctoral researcher working primarily on two projects. Digital Language Typology investigates methods for automatically discovering family relationships between languages from text and speech material, with little or no prior linguistic analysis or resources. It focuses on low-resourced Uralic languages. Immersive Automation concerns building tools for the newsroom of the future, in particular using Natural Language Generation to automatically generate news articles from data.
Lidia's research interests lie in natural language processing, more specifically in advanced methods for information extraction and media monitoring. Her favourite research objects have been large collections of newspaper articles. Lidia has participated in a number of cross-disciplinary research projects that included collaboration with linguists, sociologists, journalists and other domain experts. Currently she is working on two media-related projects: NewsEye, aimed at supporting the study of historical newspapers, and Embeddia, aimed at automatic news generation for under-represented languages.
Simo's doctoral studies consider the intersection of computational creativity, autonomous agents and multi-agent systems. From a single-agent perspective, he is interested in how autonomous and self-adaptive agents can exhibit creativity both in their outputs and in their internal processes. In multi-agent settings, his main focus is on how a group of creative agents can work together in novel ways to accomplish tasks that are not easily fulfilled by any single agent alone.
Leo's current research interests lie in the fields of Natural Language Generation and Data Science, as well as their applications to different domains, especially automated journalism and automated report generation. He is currently working on the Immersive Automation research project to enable the automated production of engaging, data-driven news content. Previously, Leo worked on Learning Analytics, Educational Data Mining and Computer Science Education, fields which remain close to his heart.
Eliel’s research interests lie in machine learning, data mining, and natural language processing. He is working on the Digital Language Typology project concerned with the computational discovery of structural relationships between languages, in terms of various typological dimensions. The project is focused on low-resourced languages, calling for language-independent methods applicable to unannotated data.
Sardana is interested in the revitalization of low-resource languages. She works on the development of various tools and resources for such languages. Sardana is currently working on transfer learning, which uses resources from better-resourced languages to create digital tools and methods for less-resourced ones. She is specifically interested in low-resource Turkic and Finno-Ugric languages.
Research assistants in the group:
- currently none
- Dr. Floris Geerts, postdoc (9/2002-4/2004)
- Dr. Bart Goethals, postdoc (1/2003-9/2004)
- Dr. Päivi Onkamo, postdoc (11/2002-12/2004)
- Dr. Sebastien Mahler, postdoc (2/2009-7/2010)
- Dr. Fang Zhou, postdoc (9-12/2012)
- Dr. Alessandro Valitutti, postdoc (8/2011-12/2013)
- Dr. Tommi Opas, external member, serial entrepreneur (2013)
- Dr. Ping Xiao, postdoc (1/2014-9/2016)
- Dr. Sirpa Riihiaho, postdoc (6/2017-2/2018)
- Dr. Myriam Munezero, postdoc (1/2017-10/2018)
- Dr. Hadaytullah Kundi, postdoc (8/2017-2/2019)
- Dr. Kari Vasko, PhD 2004
- Dr. Petteri Sevon, PhD 2004
- Dr. Mika Raento, PhD 2007
- Dr. Kimmo Hätönen, PhD 2009
- Dr. Kari Laasonen, PhD 2009
- Dr. Petteri Hintsanen, PhD 2011
- Dr. Fang Zhou, PhD 2012
- Dr. Mika Timonen, PhD 2013
- Dr. Lauri Eronen, PhD 2013
- Dr. Esther Galbrun, PhD 2014
- Dr. Joonas Paalasmaa, PhD 2014
- Dr. Laura Langohr, PhD 2014
- Dr. Oskar Gross, PhD 2016
- Dr. Jukka Toivanen, PhD 2016
- Dr. Anna Kantosalo, PhD 2019
Depending on the funding situation, we often have vacant positions for postdocs. Currently, we are especially looking for postdocs with a background in computer science or language technology, with interests in topics such as natural language processing, text mining, data science, natural language generation, and computational creativity. NB: Applications for student internships are only taken via the joint call of the department, usually in Jan-Feb. New PhD students are taken only in exceptional cases.
If you are interested in joining the research group, please contact Prof. Hannu Toivonen. Regardless of the nature of the position you are looking for, please:
- read two or three of our recent articles that look most interesting to you
- explain in your cover letter what you found most interesting in those articles
- also explain what topics you would like to work on
- attach a copy of your study transcript in English (an unofficial copy is OK)
Due to the large number of applications, we only reply to messages that follow the above instructions.