Who are you?
I am Jack Rueter, a principal investigator in Digital Humanities at the University of Helsinki and a Project Researcher in Finnish and Finno-Ugric Languages at the University of Turku. I work with the contextual disambiguation of corpora that are annotated both manually and with rule-based systems. At the age of seventeen, I spoke my first words of Finnish, and since then I have endeavored to acquire a working knowledge of several other non-English languages.
What is your research topic?
During my studies and subsequent research on Uralic and other minority languages, I have gradually expanded my understanding of how language-technological tools and practices can enhance fundamental work in linguistics. I began my first finite-state description of Komi-Zyrian a quarter of a century ago and followed it with parallel and corpus work for the Erzya language at the beginning of this millennium, but it is the last decade that has seen ambitious collaboration in the description of languages in several branches of the Uralic language family and beyond. These descriptions have centered on the study of lexica, rich yet regular morphology, and syntax, and on the idea that useful language documentation might be facilitated through the development of tools and learning environments for multilingual application.
My work with the Komi-Zyrian language began while taking a course at the University of Helsinki in the early nineties. Our teacher, E. Cypanov, offered us lessons based on materials he had written in Russian. No Komi-Finnish or Komi-English dictionaries were available at the time, so I undertook the translation of his glossary into a small trilingual Komi-English-Finnish word list, which I was able to proofread and expand with a scholarship from the Alfred Kordelin Foundation. At the time, such word lists were seen as a fundamental starting point for finite-state descriptions, and so, in 1995, I was able to begin modeling a finite-state description of Komi-Zyrian on a Unix system with advice from Professor Kimmo Koskenniemi.
From 1996 until 2004, I spent a large part of my time among the Komi, the Erzya and the Moksha. During this time, I taught Finnish at the Mordovian State University in Saransk, Mordovia – about 600 kilometers east-southeast of Moscow. There, in addition to language instruction, I began collecting and digitizing Mordvin language literature, learning the two literary languages and developing relations with professional language users and native speakers. These personal contacts have contributed to my knowledge of the languages and provided me with native-language descriptions of them, which are essential to their adequate documentation. This was also a time to become familiar with other languages spoken in Russia as well as to foster affiliations with language research at the Universities of Turku and Tromsø.
Upon leaving my teaching position in Saransk, I immediately became involved in work with the open-source infrastructure Giellatekno in Tromsø. Trond Trosterud and his colleagues were interested in my work with Komi and wanted to include it in their Barents and circumpolar language-technology development. Needless to say, I acquiesced, and open-source Komi became another piece of the puzzle for extensive dictionary and morphology work in my collaboration from Helsinki, where I began my postgraduate studies. Language technology definitely played a strong role in the categorization of morphological phenomena in the Erzya language, a forerunner to what I documented in my dissertation in 2010 and what I would greatly expand upon in subsequent work funded by the Kone Foundation under the auspices of its «Language Programme» (2012–2021).
The Language Programme saw extensive pilots and projects for digitizing endangered materials from the 1920s–40s for Finnish kindred languages in
Lexicon and morphology only really make sense if you can apply them to broader usage, such as syntax and meaningful applications like translation. Thanks to Anssi Yli-Jyrä, I became involved in the Universal Dependencies project in the late 2010s. It was here that I debuted with a treebank for
Apertium started out with translation between Catalan and Spanish, two related language forms. This initially involved conversion of the lexicon from source to target, the subsequent transfer of morphological information, and finally an adaptation of the resulting source syntax to target syntax and idioms. The idea of translating between closely related languages on the basis of the shallow transfer of regular morphological categories and information describes a tool that, in addition to facilitating informative reference translation, might also be used to measure the distance between language forms through documented lexical, morphological, syntactic and idiomatic convertibility. The development of shallow-transfer tools for the triangle
How is your research related to Kielipankki?
At the end of the last millennium, I began collecting Moksha, Erzya and Komi literature, with releases from the authors and publishers, for compilation and research use in the University of Helsinki Language Corpus Server (UHLCS), which has since been incorporated into the materials of the Language Bank of Finland (Kielipankki). FIN-CLARIN has provided me with time and resources for validating older UHLCS materials and for coaching in the development of newer corpora and educational materials. This has meant that I have had the opportunity to bring my own
Publications
Rueter, J., Partanen, N., Hämäläinen, M., & Trosterud, T. (2021).
Hämäläinen, M., Rueter, J., & Alnajjar, K. (2021).
Rueter, J., Hämäläinen, M., & Partanen, N. (2020).
Hämäläinen, M., Alnajjar, K., Rueter, J., Lehtinen, M., & Partanen, N. (2021).
Rueter, J., Pereira de Freitas, M. F., Facundes, S., Hämäläinen, M., & Partanen, N. (2021).
Rueter, J. (2020).
Rueter, J. (Author), & Axelson, E. (Author). (2020).
Rueter, J., Partanen, N., & Ponomareva, L. (2020).
Rueter, J. M. (2020).
Rueter, J., Partanen, N., & Pirinen, T. A. (2021).
Rueter, J., & Hämäläinen, M. (2020).
Rueter, J. M. (Accepted/In press). Mordva. In R. Valijärvi & D. Abondolo (Eds.), The Uralic Languages. Routledge.
More information on resources in Kielipankki
Other resources and repositories