Through the Language Bank of Finland, anyone can study the letters of author Aleksis Kivi, read old newspapers from the collections of the National Library of Finland, browse discussions on the popular Suomi24 forum or revisit New Year’s speeches given by the President of Finland.
The Language Bank is a service that compiles text and speech corpora for researchers in several fields. Many of its resources are public and available to anyone. The Language Bank also has material in languages other than Finnish, as well as multilingual translation corpora.
Certain Language Bank resources may be restricted for copyright or data protection reasons. Some of them can be accessed with a University of Helsinki login. Access rights to materials with tighter restrictions can be requested through a handy online service.
Text and speech are not the only types of language resources in the Language Bank, as it also contains sign language material.
– There is an upward trend in sign language resources, explains Mietta Lennes, project planner.
Through cooperation with the Finnish Association of the Deaf, the Language Bank has acquired Finland’s language policy programme for sign languages – in sign language, of course!
Literature and discussion forums
Thanks to the searchable corpus of the Suomi24 discussion forum, researchers can, for example, study the informal use of words originating in literature. What kinds of expressions could such a search cover? Mietta Lennes provides some examples to illustrate the way the search works:
– The word jästi, Jaana Kapari-Jatta’s Finnish translation of muggle from J.K. Rowling’s Harry Potter books, yields many hits in the entertainment and culture section of the forum, but also in the sections on pets and relationships.
Most of the corpora available through the Language Bank’s Korp search engine are grammatically analysed, meaning that a search for “jästi” will yield all inflected forms of the word, including its different case endings.
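The idea behind such a search can be sketched in a few lines of Python. This is a toy illustration only, not how Korp is implemented: the `Token` type, the hand-annotated `CORPUS` and the `search_by_lemma` function are hypothetical names invented for this example, and the sentences are made up. The sketch assumes that each word in a corpus has already been tagged with its base form (lemma), so that a search on the base form finds every inflected form.

```python
from typing import NamedTuple


class Token(NamedTuple):
    form: str   # the word as it appears in the text
    lemma: str  # the base form assigned by grammatical analysis


# Hypothetical, hand-annotated mini corpus (one list of tokens per sentence).
CORPUS = [
    [Token("Jästit", "jästi"), Token("eivät", "ei"), Token("ymmärrä", "ymmärtää")],
    [Token("Kissani", "kissa"), Token("on", "olla"), Token("jästi", "jästi")],
    [Token("Puhuin", "puhua"), Token("jästien", "jästi"), Token("kanssa", "kanssa")],
    [Token("Metsässä", "metsä"), Token("oli", "olla"), Token("hiljaista", "hiljainen")],
]


def search_by_lemma(corpus, lemma):
    """Return (sentence index, word form) for every token whose lemma matches."""
    wanted = lemma.lower()
    hits = []
    for index, sentence in enumerate(corpus):
        for token in sentence:
            if token.lemma == wanted:
                hits.append((index, token.form))
    return hits


for index, form in search_by_lemma(CORPUS, "jästi"):
    print(index, form)
```

Because the comparison is made on the lemma rather than on the written form, the single query “jästi” picks up the plural and genitive forms in the toy corpus as well.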
The Korp service can also reveal that “Kyllä minä niin mieleni pahoitin”, the catchphrase of the protagonist in author Tuomas Kyrö’s Mielensäpahoittaja books, became widely used on the forum after the premiere of the first film adaptation of the books.
Dialect researchers can find dialectal variants of standard-language words. For example, the Finnish word for “forest”, metsä, often becomes mehtä in eastern dialects and mettä or messä in western ones.
Recording language and culture through cooperation
The Language Bank is a service provided by the FIN-CLARIN consortium, which comprises Finnish universities and research organisations. The consortium serves as a support network for the Language Bank. The University of Helsinki is responsible for the acquisition of material, tools and educational activities. The CSC – IT Centre for Science, a member of the FIN-CLARIN consortium, is responsible for the Language Bank's technical maintenance.
– FIN-CLARIN aims to ensure that its resources are securely stored and accessible. At the same time, we are helping researchers receive the credit they deserve for compiling and sharing these language corpora, Mietta Lennes explains.
All researchers and research groups working in Finland can offer their language corpora to FIN-CLARIN to make them accessible to other researchers.
– Through the Language Bank, we can offer researchers a selection of tools to process and study our resources, including search functions, automatic analyses of the resources and other processing. FIN-CLARIN also has cultural historical significance as a preserver of language and culture.
FIN-CLARIN is part of the international CLARIN ERIC research infrastructure, which includes 18 countries. The Language Bank held its 20th anniversary seminar on 9 and 10 June.