Who are you?
I am Filip Ginter and I am an associate professor of language technology at the University of Turku. I am also presently the longest-serving member of the TurkuNLP research group. I am a computer scientist by training, profoundly enjoying the many unique challenges human language poses.
What is the focus of your research?
Blessed with neither patience nor a long attention span, I have managed to dip into quite a few research topics over the years with our TurkuNLP team. We started off with scientific literature mining, but then branched out into more general development of various NLP tools and resources. I’ve always had a soft spot for Finnish and chose to contribute especially to Finnish NLP, perhaps to give back to the society which so generously hosted me for my PhD research. Personally, my most important – or at least most visible – undertaking was the Turku Dependency Treebank, which later became one of the first treebanks in the super-successful
Recently, of course, I could not help but jump on board the deep learning tsunami. TurkuNLP’s previous work on crawling the Finnish Internet and gathering billions of words of Finnish paid off when it became a crucial part of the training corpus of the
And where do I go from here? I see it as my goal to bring to Finnish, one way or another, most of the tools, tasks, and resources that the bigger languages have. Think about question answering, summarization, semantic search, paraphrase models and many other NLP tasks not yet properly covered for Finnish. If they can exist for English, then they should exist for Finnish too. We are living in exciting times in NLP, and we now have many more opportunities to make it happen than we had even five years ago. And of course, with the
Apart from these more or less mainstream NLP projects, I have had several, I dare say successful, collaborations in the field of digital humanities, in particular with historians. I enjoyed these projects as they challenged us with interesting technical and algorithmic problems to solve.
How is your research related to Kielipankki?
Perhaps my most visible contribution to the Language Bank is the
Naturally, we have used the Language Bank’s resources extensively here in TurkuNLP, perhaps most of all the
I cannot stress enough how important it is for Finnish NLP that we all contribute open datasets and free tools and models to the Language Bank and also maintain our edge in terms of computational resources, with LUMI being the perfect example.
Publications
J. Kanerva & F. Ginter & S. Pyysalo 2020.
J. Kanerva & F. Ginter & T. Salakoski 2020.
J. Kanerva & F. Ginter & N. Miekka & A. Leino & T. Salakoski 2018.
A. Vesanto & A. Nivala & T. Salakoski & H. Salmi & F. Ginter 2017.
Tools and corpora (available via Kielipankki)
More information
The