Team 1 engages with digital analysis of linguistic corpora and social network analysis, supported by traditional methods of textual analysis. It harnesses recent digital humanities tools to answer ‘big data’ questions. With a few isolated exceptions, language technology-related research has not previously been pursued in Assyriology. Social network analysis has been used as an analytical tool, but has not yet been automated at the corpus level.
Team 1 uses and develops state-of-the-art methods from the field of language technology to handle masses of ancient textual data. These electronic corpora can be processed by computational means (e.g. clustering and machine learning) to generate semantic domains for lexemes relating to social group identities. The lexemes to be studied will be chosen by Assyriologists, while the language technology researchers will adapt methods from their field to generate automated semantic networks. The team also includes scholars with competence in both Assyriology and digital humanities. Working closely together, the philological experts will guide the development of complex language technology algorithms and their applications. Building contextual semantic models from primary sources offers novel possibilities for understanding first millennium BCE Mesopotamia from an emic perspective. Instead of relying on existing definitions, Team 1 starts from a corpus perspective. For example, the Akkadian word nakru often means ‘foreign, enemy’, but its semantic domain has not been thoroughly researched, because doing so would be prohibitively slow without the automated methods of language technology. By identifying semantic domains, researchers can attempt to determine which concepts (or which domains) were important for the people writing Akkadian. Such a quantitative perspective also greatly broadens the possibilities for semantic and linguistic research on these text corpora by other linguists and historians. The work on semantic domains is already underway in Helsinki. The main co-operation partners and data providers are the Open Richly Annotated Cuneiform Corpus (ORACC) and the Helsinki-based electronic Neo-Assyrian Text Corpus (NATC). Team 1 plans to expand these approaches to text corpora from later periods in the future. Furthermore, in addition to the goal of building semantic domains automatically, other corpus linguistics approaches will be explored.
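To make the idea of a corpus-derived semantic domain more concrete, the following is a minimal sketch of one common association measure, pointwise mutual information (PMI), computed over a context window around a target lexeme. It is not a description of Team 1's actual pipeline. The function name `pmi_neighbours`, the window size, and the toy sentences (English placeholder glosses around the lemma nakru) are all assumptions made for illustration; none of the data comes from ORACC or NATC.

```python
import math
from collections import Counter


def pmi_neighbours(sentences, target, window=2):
    """Rank context words by PMI with `target`.

    `sentences` is a list of token lists (assumed to be lemmatised already).
    The score is log2( P(word | near target) / P(word) ), so positive values
    mean the word occurs near the target more often than chance predicts.
    """
    word_counts = Counter()
    context_counts = Counter()
    total_tokens = 0

    for tokens in sentences:
        for i, word in enumerate(tokens):
            word_counts[word] += 1
            total_tokens += 1
            if word != target:
                continue
            # Collect every token within `window` positions of the target.
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context_counts[tokens[j]] += 1

    total_context = sum(context_counts.values())
    scores = {}
    for word, joint in context_counts.items():
        p_near_target = joint / total_context
        p_overall = word_counts[word] / total_tokens
        scores[word] = math.log2(p_near_target / p_overall)

    # Highest association first; ties broken alphabetically for stability.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented toy corpus: placeholder glosses, NOT real attestations.
toy_corpus = [
    ["king", "defeat", "nakru", "land"],
    ["nakru", "land", "plunder"],
    ["king", "build", "temple"],
    ["king", "land", "rule"],
]

ranking = pmi_neighbours(toy_corpus, "nakru")
for word, score in ranking:
    print(f"{word}\t{score:.3f}")
```

In this toy setting, words that appear only near the target rank highest, while a word that is frequent everywhere (here "king") receives a negative score. On a real corpus the ranked list would be one possible starting point for the clustering step mentioned above, with the choice of target lexemes left to the Assyriologists.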
Lexeme-based methods alone cannot capture all social group identities, since they exclude non-lexicalised concepts (e.g., no single Akkadian lexeme represents the English word ‘people’). Thus Team 1 also utilizes social network analysis. Tracing social networks through prosopographic data enables Team 1 to delineate social groups among the elite of the empires under study. Supported by traditional philological and historical work on the texts, Team 1 covers an emic point of view for each period's corpus.
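As a rough illustration of how prosopographic data can be turned into a social network, the sketch below links any two persons attested in the same text and then reads off connected groups. It is only one simple way to do this, under the assumption that co-attestation in a document counts as a tie. The function names `build_network` and `connected_groups`, the text identifiers, and the person labels are hypothetical and do not come from any real archive.

```python
from collections import defaultdict
from itertools import combinations


def build_network(documents):
    """Return {(person_a, person_b): weight} for co-attested persons.

    `documents` maps a text identifier to the persons named in that text.
    The weight is the number of texts in which both persons appear.
    """
    edges = defaultdict(int)
    for persons in documents.values():
        # sorted(set(...)) removes repeats and gives each pair a fixed order.
        for a, b in combinations(sorted(set(persons)), 2):
            edges[(a, b)] += 1
    return dict(edges)


def connected_groups(edges):
    """Return the connected components of the network as sorted lists."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited person.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Invented toy data: labels are placeholders, NOT real individuals or texts.
toy_documents = {
    "text_1": ["person_A", "person_B", "person_C"],
    "text_2": ["person_A", "person_B"],
    "text_3": ["person_D", "person_E"],
}

network = build_network(toy_documents)
groups = connected_groups(network)
print(network)
print(groups)
```

Connected components are a deliberately coarse notion of "group"; a fuller analysis would more likely use edge weights, centrality measures, or community detection, and would first need to resolve homonymous individuals, which this sketch does not attempt.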
Team 1 will specifically:
Work package 1 concentrates on the core social group identities of these empires. Team 1 will contribute especially to the emic view, analysing how elites viewed themselves. To investigate imperial identities, Team 1 members use language technology, social network analysis, and traditional archival work. Depending on the limitations of the material, these approaches will be carried out for all chronological periods under study in ANEE.
Work package 2 concentrates on marginal and marginalizing regions. Empires cannot exist without marginal regions, so understanding them is essential for understanding imperial dynamics. This work package will compare marginal areas and former centres that became marginal, to explore how their local elites interacted with imperial systems differently from those in more central regions (WP1). For Team 1, the schedule of work runs in parallel to the WP1 schedule. Identities within different areas of the empires are examined from the point of view of language technology, social network analysis, and traditional archival work. Marginal identities during the Neo-Babylonian period must be studied via close reading of texts, due to the lack of large archives from outside Babylonia. The study of marginal identities during the Persian and Hellenistic periods will mostly be based on texts from Babylonia as a marginalizing region, examined through both lexeme studies and social network analysis. The study of Hellenistic and early Roman identities in the Near East focuses on local identities in flux. Here traditional, text-based historical research is necessary, as the archival evidence is scarce.
Work package 4 synthesises the results of the previous three WPs into a holistic view more useful to ANEE’s stakeholders. WPs 1–3 establish aspects of social group identities and lifeways in the urban centre and in rural margins. Yet properly assessing the meaning of the studies done in WPs 1–3 requires additional analysis that explicitly interrelates the multiple methodologies employed. Team 1 will integrate the results of its language technology inquiries, social network analysis, and traditional text-based research to synthesize emic views on social group identities and lifeways and on how they change with geographic shifts of power. In practice, this includes overview publications (mostly co-authored) that cover changes in semantic fields and social group dynamics over the longue durée, and a methodological handbook on digital approaches to ancient texts. This stage of work therefore requires close co-operation on a number of synthetic publications.