Introduction to methods for digital humanities

This page collects the material for the "Introduction to methods for digital humanities" course taught at the University of Helsinki (note the definition of our digital humanities). The intention is for this material eventually to be usable for complete self-study in addition to contact teaching. We're not there yet, so for now you, the reader, are responsible for sifting out the parts that are already usable without an accompanying lecture.

For any enquiries, please contact

Together with its sister course, this course acts as an introductory signposting course to our digital humanities. While the sister course, Introduction to digital humanities, lays out the landscape of all the different digital humanities and charts our place there, this course aims to map the landscape within that definition. The course thus provides students with the knowledge they need to choose their own focus within computational humanities, including the ability to choose among the optional courses in the DH module.

After this course the student understands the multiple ways in which methods benefit work within the digital humanities. She herself is able to use simple tools to work with data. In addition, she has attained knowledge of the fundamental concepts of programming, through which she can start to expand her capabilities, should she so choose. She also learns how open, reproducible research and publishing are done in practice. Further, the student gains a general literacy in advanced computer science methods applicable to digital humanities, and learns when to apply them. Finally, she learns to apply all of the above in practice in a small concrete digital humanities project.

Prerequisites: absolutely none

To pass the course, you are required to demonstrate some grasp of actual digital humanities work. Therefore, you are tasked with taking a dataset and processing it in some way to yield an interesting analysis.

Potential datasets/APIs are for example:

Tools for processing and analysis are for example:

To return the assignment, you will need to upload your data, code and results into a GitHub repository, link that repository with Zenodo and give us the Zenodo DOI for your work. Include a description of what you've done, following as best as possible the guidelines for open, reproducible research:

  1. which data did you use
  2. what did you do to it (and how can I reproduce it)
  3. what do the end results show
  4. how would you continue the work (towards academic meaning)

Further info: your work doesn't need to be a full-blown pipeline from raw data to interesting results. It can be just some steps towards that. However, if you don't have end results, you need to very explicitly describe what your next steps would be to get those (i.e. a plan for future research). 
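To give a sense of scale, here is a minimal sketch in Python of what "some steps" could look like: counting how often a word occurs per 1000 tokens in two texts. It is only an illustration, not a required approach. The function names, the two sample texts and the chosen word are all hypothetical stand-ins for whatever data and question you pick.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def relative_frequency(text, word):
    """Return the number of occurrences of `word` per 1000 tokens of `text`."""
    tokens = tokenize(text)
    if not tokens:
        # An empty text has no tokens, so avoid dividing by zero.
        return 0.0
    return 1000 * Counter(tokens)[word.lower()] / len(tokens)


# Hypothetical stand-ins for two real corpora (assumption: yours would be
# read from files or fetched from an API instead).
corpus_a = "The river was wide. The river was cold, and the night was long."
corpus_b = "She crossed the bridge at night and watched the city lights."

for name, corpus in [("A", corpus_a), ("B", corpus_b)]:
    print(name, round(relative_frequency(corpus, "river"), 1))
```

Normalising by text length is what makes the two numbers comparable when the texts differ in size. In your repository, the script, the input data and a short description of how to run it would together cover points 1 and 2 of the list above.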

To further aid you in your work, here are some previous submissions for inspiration (for most of them, you should actually click the GitHub link on the right to start to make sense of them): 

  • themes in Hungarian folk love songs - DOI: 10.5281/zenodo.44570
  • extracting and visualizing biographical information from an old bank matricle - DOI: 10.5281/zenodo.225890
  • analysis of a survey on user involvement in software development - DOI: 10.5281/zenodo.237727
  • polite vs casual address form use by Finnish language learners in different situations - DOI: 10.5281/zenodo.218844
  • discovering patterns in Chalcolithic and Early Bronze Age burials in northeast England - DOI: 10.5281/zenodo.215932
  • themes discussed in Helsingin Sanomat in 1905 - DOI: 10.5281/zenodo.44572
  • differences in use between the words maahanmuuttaja and pakolainen in Finnish newspapers from 1970 to the present - DOI: 10.5281/zenodo.44544
  • differences in how frequently Finnish and Swedish newspapers talk about the Romani people - DOI: 10.5281/zenodo.44590
  • contrasting Beck's lyrics to blues lyrics - DOI: 10.5281/zenodo.215292
  • extracting and analysing recipe information in an old cookbook - DOI: 10.5281/zenodo.216232
  • a thematic analysis of the discussion around Guggenheim on the Suomi24 forum - DOI: 10.5281/zenodo.217719
  • differences in language between texts dealing with altered states of mind and normal fiction - DOI: 10.5281/zenodo.230676
  • preliminary analysis comparing different Finnish cabinet strategies against each other - DOI: 10.5281/zenodo.216604
  • preliminary analysis of patterns in the holdings of the Finnish National Gallery - DOI: 10.5281/zenodo.218735

During the course, you will be given material to read before proceeding to the next part. Typically, these will be academic papers that make use of digital humanities methods. When reading such papers, we ask that you focus on at least:

  • Research questions - What are the humanities research questions? Does the project also target computer science research questions? If so, which ones? What is the relationship between the CS and humanities research questions?
  • Data - How has the data used been gathered? What are the data sources used? How has the data been processed? Is the data available for others to use?
  • Methods - What methods does the project apply? How do the methods support answering the research questions?
  • Partners - What is the make-up of the project? Which disciplines are represented by the participants?

Provisional core reading list:

Here I'll gather some relevant links to further resources. I think these are good, but they're also somewhat of a random selection.