This event aims to bring together students and researchers from the humanities, social sciences and computer science for a week of active collaboration in groups under the heading of digital humanities.
Digital humanities, as understood here, is about applying modern data processing to solve research questions in the humanities and social sciences. At its best, such close collaboration offers unique benefits for both fields: scholars in the humanities are able to tackle questions too labour-intensive for manual study, while computer scientists encounter new and challenging use cases for the tools and algorithms they develop.
The hackathon consists of intensive work in small groups, formulating research questions with respect to particular data sets, applying and developing methods and tools, and presenting the work at the end of the week. For information on what the hackathon was like in previous years, see: #DHH16 and #DHH15.
During the hackathon, the participants will learn how to work in multidisciplinary research projects. The hackathon will also broaden their understanding of digital humanities and of what can be achieved through such collaboration.
We will have 4 groups, each with up to 8 participants + group leaders.
General description: Is the mainstream media a lapdog of the political elite? The goal of this group is, first, to account for the scope and quality of agenda-setting power: How, and to what extent, does the political elite (governments, parliaments, politicians) determine (through government initiatives, political processes and their scheduling) what issues and themes are being discussed, how and by whom? Second, the group investigates the quality and tone of the news reports and other coverage. Potential research questions include: Are the changing relations of political power 2011–2017 reflected in news journalism, and if so, how? Are there variations in how different governments (2011–2014, 2014–2015, 2015–) are monitored and their political agendas covered? Is there a difference of approach between public service and commercial media? Who gets to represent the political elites, and whose voices are heard?
What will the computer scientists do: Apply text mining (for example named entity recognition, topic detection and sentiment analysis) to extract data points of interest to the humanists from the articles. Work with the humanists on statistical and time series analyses that highlight interesting patterns in the data.
What will the humanists and social scientists do: Analyze the political programmes of the current and the two past governments, and through close reading create lists of key topics and persons as a basis for statistical analysis. Provide context-sensitive interpretation of the results.
Potential materials: Complete corpora of Finnish political news articles from multiple sources 2011–2017.
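As a rough illustration of how a list of key topics compiled by the humanists could feed a time series analysis, here is a minimal sketch. The function name `monthly_term_counts`, the `(date, text)` input format and the toy articles are all assumptions made for this example; they do not describe the actual corpora or the tools the group will use. Note also that Finnish is heavily inflected, so real data would need lemmatization before exact matching like this is meaningful.

```python
from collections import Counter, defaultdict


def monthly_term_counts(articles, key_terms):
    """Count, per calendar month, how many articles mention each key term.

    articles:  iterable of (iso_date, text) pairs, e.g. ("2015-05-29", "...")
               (a hypothetical input format chosen for this sketch).
    key_terms: lower-case terms, e.g. a list compiled through close reading.
    """
    counts = defaultdict(Counter)
    for iso_date, text in articles:
        month = iso_date[:7]  # keep only "YYYY-MM"
        tokens = set(text.lower().split())  # naive whitespace tokenization
        for term in key_terms:
            if term in tokens:
                counts[month][term] += 1
    # Return plain dicts, ordered by month
    return {month: dict(c) for month, c in sorted(counts.items())}


# Invented toy data, in English for readability
articles = [
    ("2015-05-29", "Government programme promises cuts to education"),
    ("2015-05-30", "Opposition criticises education cuts"),
    ("2015-06-02", "Unions respond to the government programme"),
]
result = monthly_term_counts(articles, ["education", "government"])
print(result)
```

The per-month counts could then be plotted or compared across government terms; named entity recognition or sentiment analysis would replace the simple token match in a fuller pipeline.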
General description: This group focuses on the Enlightenment by analyzing how texts written in the period reflect social and political attitudes, as well as frame certain historical phenomena and emerging ideas. The objective is to recognize, in large full-text corpora covering the period, the most heated social, political and religious debates and the individual books that were creating the heat. At the same time the idea is to create new methodological approaches for recognizing such phenomena. The group will also try to analyze attitudes towards particular luxury items and other objects that strongly divided opinions during the Enlightenment. The group will be working in cooperation with researchers currently focusing on the same topic and will benefit from work and tools already developed for the datasets.
What will the computer scientists do: Work with the humanists to develop new methodologies for analyzing the impact of, and attitudes towards, books and phenomena, based on for example referential analysis, computational statistics, text mining and sentiment analysis.
What will the humanists and social scientists do: Analyze the currents in intellectual history reflected in the source material, and through close reading create typologies that will form the basis of statistical analysis. Interpret the developments and changes uncovered by computational methods in the material over the century.
Potential materials: Eighteenth Century Collections Online (full texts of ~50% of all literary works published in 18th century), English Short Title Catalogue (extensive metadata of 18th century English literature).
General description: This group focuses on metatheoretical questions in positioning the possibilities of data mining with regard to traditional humanist methods of interpretation and analysis of individual cases. In particular, the research group asks what kind of textual clusters (based on genres, themes, and defining factors) can be identified in newspaper data and how this clustering benefits or opens up new questions for the humanities. On a more general level these questions link to theoretical and methodological reflections: if, as in the humanities, the researched phenomenon is often already an interpretation (such as a text’s genre, or its ideological undercurrents), what is the added value of quantity for such research? Do a thousand close readings translate into more than the sum of their parts?
What will the computer scientists do: Develop methods for style/genre detection, based on the interests of the humanists. Possible methods include for example topic modeling, sentiment analysis, clustering and machine learning ranging from vector space models to deep learning.
What will the humanists and social scientists do: Through close reading, form sets of model texts for machine reading and genre detection. Ask questions as to the usability and implications of text mining for their home disciplines. Beneficial knowledge: Finnish history and historiography, literature and early 20th century history of ideas, understanding of the principles of genre theory and/or knowledge of analytical reading as a method (and its critique as symptomatic reading), narratology.
Potential materials: The Finnish digital newspaper database (–1910) as the main source; the Corpus of Finnish Literature (1880–1910) and the Corpus of Early Modern Finnish (1810–1880) as supplementary sources.
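To give a sense of how sets of model texts could drive genre detection with a simple vector space model, here is a minimal sketch: each text becomes a bag-of-words vector, and a new text is assigned the genre of the most similar model text by cosine similarity. The function names, the genre labels and the toy texts are all hypothetical and chosen only for illustration. The group's actual methods may be quite different, for example topic modeling or deep learning.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector from naive whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_genre(text, model_texts):
    """Return (genre, similarity) of the model text closest to `text`.

    model_texts: dict mapping a genre label to a list of example texts
                 (a hypothetical structure for this sketch).
    """
    vec = tf_vector(text)
    best_genre, best_score = None, -1.0
    for genre, examples in model_texts.items():
        for example in examples:
            score = cosine(vec, tf_vector(example))
            if score > best_score:
                best_genre, best_score = genre, score
    return best_genre, best_score


# Invented toy model texts, in English for readability
model_texts = {
    "advertisement": ["for sale fine coffee and sugar at low prices"],
    "news": ["the senate met yesterday to discuss the new railway"],
}
genre, score = nearest_genre("fresh coffee for sale at the market", model_texts)
print(genre, round(score, 3))
```

Raw term frequencies are used here to keep the sketch short; weighting schemes such as TF-IDF and proper tokenization or lemmatization would matter on real newspaper data.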
General description: This group analyzes the historical changes in what is considered valuable cultural heritage. For instance, medieval castles have been nationally significant since the 19th century, while wooden houses or modern concrete architecture have been considered important only fairly recently. This theme of developing heritage conceptions is, firstly, approached by analyzing the spatial, visual and chronological distribution of institutionally protected cultural heritage in Finland. Secondly, the theme is examined by investigating how the development of the protected cultural environments (composed of built heritage, archaeological heritage, and cultural landscapes) has affected the accumulation of visual (and textual) heritage in museums and archives. How is the emergence of new heritage conceptions reflected in the actual corpus of heritage material? Is it possible to recognize toponymical clusters, or regions in the provenance of digitized museum and archival collections, and correlate them with the development of “nationally valuable” cultural environments?
What will the computer scientists do: Unearth patterns in the data by combining available spatial information with the metadata and context of text and images. Desired expertise includes for example familiarity with REST APIs, geographical information services, structured data processing, statistical analysis, visualization and data mining.
What will the humanists and social scientists do: Analyze the currents in the history of cultural heritage reflected in the sources, and the relations between the historical contexts and the present state of affairs. Provide interpretation of developments and changes uncovered in the material since the 19th century.
Potential materials: Spatial and geographic data concerning the nationally valuable built cultural environments and the nationally valuable landscapes, and datasets in Finna.
19 April 2017 10–13, Athena, Siltavuorenpenger 3 A, room 302
3 May 2017 10–13, Athena, Siltavuorenpenger 3 A, room 107
8 May 2017 10–13, Athena, Siltavuorenpenger 3 A, room 302
Main Hackathon week
15–16 May 2017 9–17, Minerva Square, Siltavuorenpenger 5, room K213
17–18 May 2017 9–18, Athena, Siltavuorenpenger 3 A, room 302
Presentation of DHH17 Hackathon projects
The projects will be presented publicly on Friday, 19 May 2017, 13:00–17:00.
The presentations take place at the Anatomical Theatre at Athena, Siltavuorenpenger 3 A, room 107 (map: https://goo.gl/maps/CpML4qjcRzR2)
Everyone interested in the hackathon and digital humanities is welcome to come and listen to the presentations! The event is free, but please register in advance using this form so that we know you are coming: https://elomake.helsinki.fi/lomakkeet/80222/lomake.html
The hackathon is aimed mainly at students at MA level and beyond. As a course, it is part of the 25 credit digital humanities module (see here). Computer science and other students with sufficient programming skills (ask the coordinating teacher if you are unsure whether your skills are sufficient) may join without prior digital humanities studies. For students in the humanities, if the event is overbooked, priority is given to those who have completed introduction to digital humanities and/or introduction to methods in digital humanities.
Register using this link: goo.gl/kiRqKR
3–5 ECTS credits may be gained from participating in the hackathon. Assessment: pass/fail, based on participation in the group work, presentation, and an individual report.
Students from the University of Helsinki: please contact Mikko Tolonen for more details on the credits.
Students from Aalto University: please contact Jukka Suomela for more details on the credits.
Mikko Tolonen, Eetu Mäkelä & Jukka Suomela
Anu Koivunen, Emily Öhman
Ville Vaara, Jukka Suomela, Antti Kanner
Ilona Pikkanen, Elsi Hyttinen, Risto Turunen
Visa Immonen, Johanna Enqvist