#DHH24 Themes

Helsinki Digital Humanities Hackathon #DHH24 will have four thematic areas of interest with one or more groups per topic, each under the auspices of the group leaders.

Eurovision Song Contest

Group leaders: Antti Kanner, David Rosson

The Eurovision Song Contest (ESC) is one of the biggest annual televised media events globally, with roughly 180 million viewers tuning in mid-May. A mixture of competition, entertainment and spectacle, it pits competing countries against each other with an intricate point system based on a combination of per-country popular and professional jury votes. As the EBU (European Broadcasting Union) shares the scoring and voting data freely, the contest makes a good target for analysis using a variety of Digital Humanities methods and approaches. Furthermore, combining the voting data with external data sources (such as full song metadata repositories, lyrics databases and streaming service APIs, to name just a few) allows enriching the analyses with a plethora of interesting perspectives and questions.

As the ESC brings the continent (and Australia) together, it forms a kind of melting pot of not only local and global popular cultures, but also debates on the state of culture and politics in Europe. In no particular order, frequent perspectives in these debates have involved the interactions between English and national languages, international networks of production professionals contributing to a large number of songs that simultaneously represent different countries, LGBTQ+ issues, political boycott calls, unorchestrated shows of solidarity for countries in crisis, and voting behaviour among regions with assumed mutual loyalties inside the continent, which is seen as bordering on the limits of sportsmanship. The Eurovision hackathon group’s precise foci of interest will follow from the combination of skills and backgrounds represented in the group and will be worked out collaboratively. We aim to merge creativity and out-of-the-box thinking with the rigour of well-thought-out and defined research questions, supported by a diverse toolbox of field-tested techniques and methods and by output-oriented planning and execution.

Because the research question will be workshopped and shaped by the group, it is open to participants with different backgrounds. As the ESC touches upon cultures and societies across such a wide spectrum, many fields within the humanities can find relevance for their expertise. Equally, as a multi-modal spectacle, the ESC will be an apt topic for people with a computational or data analysis background, ranging from data science to the wild frontiers of machine learning applied to video and sound.

Further reading

Kumpulainen, I., Praks, E., Korhonen, T., Ni, A., Rissanen, V., Vankka, J. (2020). Predicting Eurovision Song Contest Results Using Sentiment Analysis. In: Filchenkov, A., Kauttonen, J., Pivovarova, L. (eds) Artificial Intelligence and Natural Language. AINL 2020. Communications in Computer and Information Science, vol 1292. Springer, Cham. https://doi.org/10.1007/978-3-030-59082-6_7

Baker, C., Atkinson, D., Grabher, B. and Howcroft, M. (2024) Culture, Place and Partnership: The Cultural Relations of Eurovision 2023. Documentation. British Council. https://eprints.gla.ac.uk/317382/2/317382.pdf

Ariely, G., & Zahavi, H. (2022). The influence of the 2019 Eurovision Song Contest on national identity: evidence from Israel. National Identities, 24(4), 359–376. https://doi.org/10.1080/14608944.2021.2013187

Honcharova, O. O., & Kovalchuk, I. V. (2021). Structural-Semantic features of Eurovision Song Contest slogans. Lviv Philological Journal, 10, 25–32. https://doi.org/10.32447/2663-340X-2021-10.4

Enlightening Illustrations: Analyzing the Role of Images in Enlightenment-Era Luxury Books

Group leaders: Kira Hinderks, Ari Vesalainen & Iiro Tiihonen

During the Enlightenment, illustrations played a crucial role as a means to explain and communicate topics to a broader audience. However, the use of illustrations varied by genre and status of books: from crude and repeated wood-engravings of cheap pamphlets to detailed descriptions of plants and animals in hefty scientific publications. This group aims to explore how illustrations of books were related to their other characteristics like genre, price or physical size.

The novelty, size and quality of illustrations were related to various considerations that those producing them had in mind. The work might have needed (a certain kind of) illustrations to convey information (a detailed portrayal of plants in a scientific publication), to entertain in a cost-efficient manner (repeatedly used 'stock illustrations' in a cheap pamphlet) or to impress (a detailed portrait of the author in an expensive luxury edition of his work). Because the illustrations were not used at random, it is reasonable to ask if there were large-scale patterns in their use. For example, do larger and more expensive books usually have more and better pictures? We could also look at how the ways of making pictures changed over the 18th century, and how these changes enabled different strategies of illustration for works intended for different audiences or genres. Or we can focus on some other aspect of illustration use, depending on the group's interests. This work will help us see books not just as items to read, but as objects that tell stories about science, culture, and money.

The data for this investigation will be sourced from the Eighteenth Century Collections Online (ECCO), which encompasses over 200,000 volumes of printed works from the 18th century. This digital collection offers a comprehensive window into the 18th century's printed output, covering genres from scientific books to literature and philosophical works. The availability of metadata within ECCO allows for the precise identification of publications across different genres, their physical dimensions, and potentially even their pricing or intended audience. Coupled with the textual content and illustrations, this metadata provides a rich dataset for analyzing the visual and material culture of the period. By employing advanced digital humanities techniques, such as text mining and image processing, participants will be able to identify the relationship between a book's physical and visual attributes and its status as a luxury item.

Further reading

Dutta, Abhishek, Giles Bergel, and Andrew Zisserman. ‘Visual Analysis of Chapbooks Printed in Scotland’. In The 6th International Workshop on Historical Document Imaging and Processing, 67–72. HIP ’21. https://doi.org/10.1145/3476887.3476893.

Ford, Brian J. (2003): Scientific Illustration in the Eighteenth Century. https://www.researchgate.net/publication/228484639_Scientific_Illustration_in_the_Eighteenth_Century 

Hume, Robert. The Value of Money in Eighteenth-Century England: Incomes, Prices, Buying Power—and Some Problems in Cultural Economics, Huntington Library Quarterly 77 (4) (2014) 373–416.

Nickelsen, Kärin. (2006): Draughtsmen, Botanists and Nature: Constructing Eighteenth-Century Botanical Illustrations. https://www.sciencedirect.com/science/article/pii/S1369848605000956?via%3Dihub 

Puig-Samper Mulero, Miguel Ángel (2020): Illustrators of the New World: The Image in the Spanish Scientific Expeditions of the Enlightenment. https://brewminate.com/illustrators-of-the-new-world-the-image-in-the-spanish-scientific-expeditions-of-the-enlightenment/

Raven, James. The Cambridge History of the Book in Britain, Vol. 5, Cambridge University Press, 2009, Ch. Book as a Commodity, p. 83–117.

Sher, Richard. The Enlightenment and the Book: Scottish Authors and Their Publishers in Eighteenth-Century Britain, Ireland, and America, The University of Chicago Press, 2006.

Rudwick, Martin (2005): Picturing Nature in the Age of Enlightenment. https://www.jstor.org/stable/4598936 

Strien, Daniel van, Kaspar Beelen, Melvin Wevers, Thomas Smits, and Katherine McDonough. ‘Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)’. Programming Historian, 17 August 2022. https://programminghistorian.org/en/lessons/computer-vision-deep-learning-pt1

Echoes of the Chambers: Studying Democracy through Parliamentary Speeches

Group leaders: Hugo Bonin, Jani Marjanen & Risto Turunen

Democracy is one of the most contested and consensual concepts of our times. While scholars might study it through its theoretical models (Held, 2006 [1987]) or surveys on citizens’ ideas of democracy (Pilet et al., 2022), this group uses text mining to analyze parliamentary discourses on democracy in the 21st century. Parliamentary speeches can be seen as “analytical nexuses” (Ihalainen and Saarinen, 2019) where different key political concepts meet and challenge each other. In recent years, parliamentary speeches have been transformed into machine-readable data, which enables studying democracy at a level of depth and breadth not achievable with traditional humanities methods alone. Our primary dataset (ParlaMint 4.0) includes parliamentary speeches from 29 European countries. Based on the computational and humanities expertise available in the group, we aim to make ambitious transnational comparisons between different conceptualizations of democracy in Europe. More specifically, we study how the idea of democracy mutates over time and in different contexts. Depending on the interests of the participants, we might explore topics such as the rhetorical justifications for democracy, how democracy is used to justify other things, or how democracy relates to other concepts such as liberalism or climate change. By systematically comparing the similarities and differences in parliamentary discourses on democracy, especially in relation to metadata about the speakers, authors and parties, we expect to make genuine contributions to the understanding of democracy.

This group is ideal for anyone interested in using text mining and large-scale data to answer core questions of the humanities (Guldi, 2023). Our methodology combines easy-to-use computational approaches with more advanced techniques based on new language models. SSH scholars with domain expertise but without broad technical skills can analyze parliamentary speeches through an intuitive interface (NoSketch Engine). The functionalities include searching for specific terms (e.g., “democracy”), counting the words that occur in the linguistic context of the terms of interest, and measuring the lexical differences between speeches. We also apply language models to extract more nuanced semantic information from the speeches. Models based on the transformer architecture have already led to breakthroughs in various natural language processing tasks, but their value in the humanities and social sciences has not been properly tested, apart from a few pioneering experiments (Rastas et al., 2022).
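To make the idea of counting the context of a search term concrete, the following is a minimal Python sketch of collocate counting. It is not how NoSketch Engine works internally; the function names, the simple regex tokenizer, the window size and the two toy sentences are all assumptions made for illustration, and real work would run against the ParlaMint corpus.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def collocates(speeches, term, window=3):
    """Count words that occur within `window` tokens of `term`."""
    counts = Counter()
    for speech in speeches:
        tokens = tokenize(speech)
        for i, token in enumerate(tokens):
            if token == term:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


# Invented toy sentences (hypothetical, not taken from ParlaMint).
speeches = [
    "Our democracy depends on free elections.",
    "Free media protects democracy and free elections.",
]
result = collocates(speeches, "democracy", window=3)
print(result.most_common(3))
```

Raw counts like these favour frequent function words, so in practice one would usually filter stopwords or use an association measure before interpreting the list.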

Computational tasks can include but are not limited to:

  • representing multidimensional data (parliamentary speeches) as dense embeddings.

  • comparing the similarities / differences of individual politicians, genders, parties, and parliaments.

  • analyzing and visualizing time-series data related to the conceptualizations of democracy.
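As a rough illustration of the comparison task, the Python sketch below averages per-speech vectors into one centroid per party and compares the centroids with cosine similarity. It deliberately uses sparse word counts as a stand-in for the dense embeddings mentioned above, which would normally come from a language model; the party labels, the toy speeches and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a speech into a sparse word-count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def centroid(vectors):
    """Average a list of sparse vectors into a single vector."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {word: count / len(vectors) for word, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy data (hypothetical parties and speeches).
speeches_by_party = {
    "Party A": ["democracy needs free elections",
                "free elections protect democracy"],
    "Party B": ["democracy needs free media"],
    "Party C": ["the budget needs cuts"],
}
centroids = {
    party: centroid([vectorize(s) for s in speeches])
    for party, speeches in speeches_by_party.items()
}
for other in ("Party B", "Party C"):
    score = cosine(centroids["Party A"], centroids[other])
    print(f"Party A vs {other}: {score:.3f}")
```

The same centroid-and-cosine pattern applies unchanged if `vectorize` is replaced by a function returning dense model embeddings, and the grouping key can just as well be gender, parliament or year.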

Humanities and social-science tasks include but are not limited to:

  • discovering research questions related to democracy in parliamentary discourses.

  • inventing meaningful units of interest that can be measured computationally and validating the results by close reading parliamentary speeches.

  • connecting the results to prior scholarship and refining elementary quantitative information into proper arguments.

Further reading

Guldi, Jo, 2023. Dangerous Art of Text Mining. A Methodology for Digital History. Cambridge University Press.

Held, David, 2006 [1987]. Models of Democracy. 3rd ed. Stanford, CA: Stanford University Press.

Ihalainen, Pasi, and Taina Saarinen, 2019. ‘Integrating a Nexus: The History of Political Discourse and Language Policy Research’. Rethinking History 23: 500–519.

Pilet, Jean-Benoit, Damien Bol, Davide Vittori, and Emilien Paulis, 2022. “Public Support for Deliberative Citizens’ Assemblies Selected through Sortition: Evidence from 15 Countries”. European Journal of Political Research.

Rastas, I., Ryan, Y., Tiihonen, I., Qaraei, M., Repo, L., Babbar, R., Mäkelä, E., Tolonen, M., & Ginter, F., 2022. “Explainable Publication Year Prediction of Eighteenth Century Texts with the BERT Model”. In Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change, 68–77, Dublin, Ireland. Association for Computational Linguistics.

Cultures of online discourse

Group leaders: Eetu Mäkelä, Ümit Bedretdin

What does civil discourse look like and where can it be found on the Internet? Using large samples of discussions from different online fora, this group seeks to understand what features mark high-quality discussions, and tries to develop means by which healthy communities and individual nuggets of quality discussion could be identified from huge datasets.

As an anchor, the group will use the /r/ChangeMyView community on Reddit. The purpose of the community is for people who hold a particular view to challenge themselves and their rationales by submitting them to questioning by others. Owing to the community's rules of engagement, it has been shown to consistently engender respectful and impactful discourse. Based on prior work (e.g. Tan et al. 2016, Musi 2018), as well as further qualitative and quantitative study by the group itself, we will try to extract content-based and structural cues from these discussions and then test them on diverse samples of other online discussions.

As a second point of entry, we will explore how Redditors try to intervene and dissolve potential conflicts. Starting with phrases/units identified by previous research as such interventions (e.g. "stop arguing"), we will chart their use, and the characteristics of what happens before and after in the conversations where such interventions are invoked. 
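One way to approach this charting is sketched below in Python: find the comments that contain an intervention phrase and collect a window of comments before and after each hit. The sketch assumes a thread that has already been flattened into a chronological list of comment strings, which is a simplification because Reddit threads are trees; the function name, the window sizes and the toy thread are hypothetical.

```python
def intervention_contexts(comments, phrases, before=2, after=2):
    """Find comments containing any phrase and return their surrounding context."""
    lowered = [phrase.lower() for phrase in phrases]
    hits = []
    for i, comment in enumerate(comments):
        text = comment.lower()
        if any(phrase in text for phrase in lowered):
            hits.append({
                "index": i,
                "before": comments[max(0, i - before):i],
                "after": comments[i + 1:i + 1 + after],
            })
    return hits


# Invented toy thread (hypothetical, not real Reddit data).
thread = [
    "You are wrong about this.",
    "No, you are wrong.",
    "Can you both stop arguing and cite a source?",
    "Fair point, here is a link.",
    "Thanks, that helps.",
]
hits = intervention_contexts(thread, ["stop arguing"])
for hit in hits:
    print(hit["index"], len(hit["before"]), len(hit["after"]))
```

With the windows in hand, features such as comment length, sentiment or reply depth can be compared between the "before" and "after" sides of each intervention.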

Further reading:

  • Chenhao Tan, Vlad Niculae, Cristian Danescu-Niculescu-Mizil, and Lillian Lee. 2016. Winning Arguments: Interaction Dynamics and Persuasion Strategies in Good-faith Online Discussions. In Proceedings of the 25th International Conference on World Wide Web (WWW '16). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 613–624. https://doi.org/10.1145/2872427.2883081 
  • Dayter, Daria, & Messerli, Thomas C. (2022). Persuasive language and features of formality on the r/ChangeMyView subreddit. Internet Pragmatics 5(1), 165–95. https://doi.org/10.1075/ip.00072.day
  • Xiao, L., Mensah, H. (2022). How Does the Thread Level of a Comment Affect its Perceived Persuasiveness? A Reddit Study. In: Arai, K. (eds) Intelligent Computing. SAI 2022. Lecture Notes in Networks and Systems, vol 507. Springer, Cham. https://doi.org/10.1007/978-3-031-10464-0_55
  • Humphrey Mensah, Lu Xiao, and Sucheta Soundarajan. 2019. Characterizing Susceptible Users on Reddit's ChangeMyView. In Proceedings of the 10th International Conference on Social Media and Society (SMSociety '19). Association for Computing Machinery, New York, NY, USA, 102–107. https://doi.org/10.1145/3328529.3328550
  • Monti, C., Aiello, L.M., De Francisci Morales, G. et al. The language of opinion change on social media under the lens of communicative action. Sci Rep 12, 17920 (2022). https://doi.org/10.1038/s41598-022-21720-4
  • Tanskanen, S-K. (2021). “Stop arguing”: Interventions as metapragmatic acts in discussion forum interaction. In M. Johansson, S-K. Tanskanen, & J. Chovanec (Eds.), Analyzing Digital Discourses: Between Convergence and Controversy (pp. 219-244). Palgrave Macmillan. https://doi.org/10.1007/978-3-030-84602-2_9