ONLINE: HelRAW: Polina Yordanova 6.4.2020

The Helsinki Research on the Ancient World (HelRAW) is a monthly research seminar. HelRAW is organized by the SpaceLaw project together with Marja Vierros from the Digital Grammar of Greek Documentary Papyri (PapyGreek) project.

6.4.2020 at 17:00

Polina Yordanova (University of Helsinki): Finding One’s Way in the Digital Forest: Discontinuity in a Treebank of Documentary Papyri

Abstract: Word order has traditionally been an underrepresented topic in research on Ancient Greek, and even the studies that address it specifically mostly examine literary material. Documentary sources could provide insight into some unexplored aspects of the development of the language, but, because of their vast number and miscellaneous content, traditional philological methods are necessarily limited in the scope they can cover.

This is where digital technologies come in as a solution, giving scholars the opportunity to base their conclusions on quantitative data, which in turn serves as the basis for a qualitative approach. Morphosyntactic annotation of text corpora (a.k.a. treebanking) is a method proven to be extremely suitable for linguistic research from all points of view, but it is perhaps the best tool for studying word order in particular, as it allows the researcher to keep track of both the position of words in the sentence and the syntactic relations between them.

Creating a morphosyntactically annotated corpus can be a cumbersome and time-consuming process, especially if done by hand, but it is querying treebanks that is the truly problematic endeavor. In order to query my trees for discontinuous structures, I have employed the power of XSLT (eXtensible Stylesheet Language Transformations), which allows me to manipulate the treebanked files and enrich their encoding through additional annotations. I will demonstrate how this approach can be applied to a heterogeneous corpus such as the one assembled by the Digital Grammar of Greek Documentary Papyri (PapyGreek) project.
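
As a rough illustration of what a query for discontinuous structures involves, the sketch below flags every word whose dependents, taken together with the word itself, do not occupy an unbroken stretch of the sentence. It is written in Python rather than XSLT and is not the method presented in the talk. The XML layout (a `sentence` element containing `word` elements with `id`, `form`, `head` and `relation` attributes) and the three-word toy sentence are assumptions made for the example and may differ from the encoding of the PapyGreek files.

```python
import xml.etree.ElementTree as ET

# Toy sentence in a hypothetical layout: "id" is the linear position of the
# word, "head" is the id of its syntactic head (0 = root of the sentence).
SAMPLE = """
<sentence id="1">
  <word id="1" form="great" head="3" relation="ATR"/>
  <word id="2" form="saw" head="0" relation="PRED"/>
  <word id="3" form="city" head="2" relation="OBJ"/>
</sentence>
"""

def discontinuous_heads(sentence):
    """Return the ids of words whose subtree is not a contiguous span.

    Assumes the head attributes form a well-formed tree (no cycles).
    """
    heads = {int(w.get("id")): int(w.get("head")) for w in sentence.iter("word")}

    # Invert the head relation: head id -> list of dependent ids.
    children = {}
    for wid, head in heads.items():
        children.setdefault(head, []).append(wid)

    def subtree(wid):
        """Collect the positions of a word and all of its descendants."""
        ids = {wid}
        for child in children.get(wid, []):
            ids |= subtree(child)
        return ids

    flagged = []
    for wid in sorted(heads):
        span = subtree(wid)
        # A contiguous span covers every position between its two ends.
        if max(span) - min(span) + 1 != len(span):
            flagged.append(wid)
    return flagged

print(discontinuous_heads(ET.fromstring(SAMPLE)))  # prints [3]
```

In the toy sentence the modifier "great" (position 1) depends on "city" (position 3), while the verb at position 2 stands between them, so the phrase headed by word 3 is reported as discontinuous. The same check could in principle be expressed as an XSLT template that adds an annotation attribute to each flagged word.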