Organised for the third time at the University of Helsinki last May, the Helsinki Digital Humanities Hackathon invited participants to take a swing at research problems from the humanities and social sciences using methods from computer science. The Hackathon concept is that small groups work together for a short, intense period of time to analyse data, set research questions and find solutions. This time there were five Hackathon groups, and participating students were free to choose the group they were most interested in.
The group Helsinki Geotagged Social Media was presented with a vast set of social media data about Helsinki, which has already been used by researchers in previous studies. The selection featured social media posts geotagged in Helsinki between 2014 and 2018: 1.3 million Instagram posts, 61,000 tweets and nearly 130,000 Flickr posts.
The group supervisors, Tuomo Hiippala, assistant professor in the English language and digital humanities, and Tuuli Toivonen, associate professor in geoinformatics, were used to working with huge social media datasets in their research. They have previously analysed geotagged social media posts, for example to determine the number of visitors to national parks.
“It’s easiest to analyse a massive social media dataset like this when you have a multidisciplinary team,” says Hiippala.
“The work requires geoinformatics, human geography, linguistic technology, machine learning, communications, linguistics and urban research, just to name a few. The issue of research ethics is inevitable, as the research deals with content generated by individuals and their personal data.”
Saara Suominen, who studies English philology at the University of Helsinki, was the only humanities student in the group. Qazi Firas, an exchange student from India, has a background in architecture, but he also learned to use the software needed to draw maps for the project. The data experts in the team were Anton Matveev and Iuliia Kim, who came to the Hackathon from Russia’s ITMO University, along with Sid Rao, a doctoral candidate at Aalto University.
Instagram posts are emotional, and therefore good material for sentiment analysis
The work began with the group poring over a map on their computer screens, with every Instagram post geotagged in Helsinki marked with a dot, and all text posts relating to the images in a giant database table. In addition to each text, the table featured precise information about where and when the post had been made and by whom.
The beginning. The map shows Instagram posts geotagged in Helsinki between 2014 and 2016. Each dot on the map represents one Instagram post.
The group narrowed the scope of their work by selecting, from the massive dataset, just under 200,000 English-language Instagram posts geotagged in Helsinki between 2014 and 2016.
The team analysed the language, the text and hashtags that users had added to their Instagram photos. They also included emojis in their analysis.
The students decided to conduct sentiment analysis and use it to discover how moods vary around Helsinki. Instagram posts in particular invite such questions, as they are typically quite emotional. The dataset also included tweets and Flickr posts, but Twitter focuses on public discussion, while Flickr is mainly the domain of photography enthusiasts and professionals.
“People use Instagram to share their experiences, which means that it offers the best perspective to the ways its users operate and think,” says Hiippala.
At first, the group debated whether the analysis would be skewed by the fact that Instagram posts tend to be happy, with an emphasis on positivity.
As they delved deeper into their material, however, they also discovered more sombre themes.
“We decided to look for both the happiest and the saddest places in Helsinki,” explained Qazi Firas when the Hackathon was about halfway through.
The analysis of emotions was very crude, as the AI algorithm they used could only classify sentiment as positive, negative or neutral. In reality, the scale of human emotion is of course much broader than that, as is the range of social media posts that arise from it.
Largest number of posts in winter, best mood in late summer and during the holidays
Around the middle of the Hackathon, the group had finalised their research questions: Instead of pinpointing the single happiest place in Helsinki, the group decided to determine how moods vary from one part of town to the next on a scale of positive/neutral/negative. Another target of analysis was the impact of the time of year on the number and mood of the posts.
Just a few days after setting the research questions, the group was ready to present their results at the final event of the Hackathon.
They had found that most Instagram posts in Helsinki are made in the winter, while the posts are at their most positive in late summer and during the holidays.
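The seasonal comparison can be pictured as a simple aggregation. The following is a minimal sketch, not the group's actual code: the record layout, the sample posts and the function name `monthly_summary` are assumptions made for illustration, and each post is taken to already carry one of the three sentiment labels.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample records: (posting date, sentiment label).
# These are invented for illustration and are not taken from the dataset.
posts = [
    (date(2015, 1, 10), "neutral"),
    (date(2015, 1, 22), "positive"),
    (date(2015, 1, 30), "negative"),
    (date(2015, 8, 5), "positive"),
    (date(2015, 8, 19), "positive"),
]


def monthly_summary(records):
    """Return {month: (post_count, share_of_positive_posts)}."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for posted_on, label in records:
        counts[posted_on.month] += 1
        if label == "positive":
            positives[posted_on.month] += 1
    # Only months that have at least one post appear, so no division by zero.
    return {m: (counts[m], positives[m] / counts[m]) for m in sorted(counts)}


summary = monthly_summary(posts)
for month, (count, share) in summary.items():
    print(f"month {month:2d}: {count} posts, {share:.0%} positive")
```

Keeping the post count and the share of positive posts side by side matters here, because the busiest months and the happiest months need not coincide.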
The results surprised the group. One might think that the happiest posts would be in the summer, when Helsinki is warm and sunny. But even during the darkest time of the year, people can find happiness in the sleet:
“Many of the posts geotagged in Helsinki are taken indoors,” points out Hiippala.
Such photos with happy text posts may be of a glass of champagne at a Christmas party, or of a group of friends in candlelight. On the other hand, Finns are known for their love of melancholia – perhaps people want to communicate their happiness by posting a photo of a solitary walk down dark, quiet streets.
The Insta-happiness distribution in Helsinki by district
And what about the happiest and saddest places? This is what Helsinki looks like based on the group’s results:
The map depicts the most prominent moods based on Instagram posts by Helsinki district. Red means a negative, yellow neutral and green a positive mood.
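One way to arrive at a single colour per district is a majority vote over the labels of the posts geotagged there. The sketch below assumes that "most prominent" means "most frequent"; the group's actual rule is not described in the article. The district names, the sample labels and the function name `dominant_mood` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (district, sentiment label).
labelled_posts = [
    ("District A", "negative"),
    ("District A", "negative"),
    ("District A", "positive"),
    ("District B", "positive"),
    ("District B", "neutral"),
    ("District B", "positive"),
]

# Colour scheme as described in the map caption.
COLOURS = {"negative": "red", "neutral": "yellow", "positive": "green"}


def dominant_mood(records):
    """Return {district: most frequent label among its posts}."""
    by_district = defaultdict(Counter)
    for district, label in records:
        by_district[district][label] += 1
    # most_common(1) gives the top label; on a tie, the label seen first wins.
    return {d: c.most_common(1)[0][0] for d, c in by_district.items()}


moods = dominant_mood(labelled_posts)
for district, mood in moods.items():
    print(f"{district}: {mood} ({COLOURS[mood]})")
```

A majority vote hides how close the result was: a district with 51 negative and 49 positive posts is coloured the same as one with 100 negative posts.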
Analysing why they got the results they did posed a challenge to the group. Saara Suominen was the only long-term resident, and consequently, the only one to know Helsinki well enough. The others were from other countries, either as exchange students or visiting specifically for the Hackathon.
Etu-Töölö, Kamppi, Kallio and Kalasatama show on the map as areas with a negative mood, while many of the districts in north and east Helsinki are positive or neutral.
Sid Rao discusses the group’s analysis at length in his blog post.
The algorithm did not automatically show which photo was associated with the text. However, the group did manual checks of photos which had been analysed as negative based on the accompanying text. There was nothing particularly negative about the photos themselves. However, a photo of a dinner with wine had a caption referring to a “lonely past”, a landscape shot mentioned a flat bike tyre, and a cup of lemon tea was said to be a cure for a “happy hangover”.
Which neighbourhoods in Helsinki receive the most geotags? The centre of Helsinki seems disproportionately popular for geotags. This is probably because most tourists only move around and post in this small area. In addition, if a photo is just geotagged “Helsinki”, with no specific neighbourhood, such as “Töölö” or “Vuosaari”, the post is located in the centre of town in the data.
Multidisciplinary group attains astonishing results
It took some time for the multidisciplinary and international group to find their stride, but the experience was ultimately positive.
“The main thing about the Hackathon was learning to communicate in a multidisciplinary group,” said Saara Suominen when the event was about halfway through.
“I’ve noticed that typically, people in the humanities will have the questions and the scientists the methods.”
Tuomo Hiippala was impressed by the group’s morale.
The group already featured many different kinds of skills at the outset, but many of its members performed incredible feats when specific types of competence were required. They would learn new things on the fly, such as using the software that was needed in the analysis. Elias Willberg, a master’s student of geography, helped the group with the finer points of geoinformatics.
The interesting results on the emotional scale of Helsinki’s Instagram posts came almost as an afterthought from the group’s multidisciplinary work and the new skills they learned.
The group’s work also gave rise to new research questions that researchers may address in the future.
This autumn, Tuomo Hiippala is heading a study which will examine the linguistic landscape and different user groups of social media posts in the Greater Helsinki Area. What languages are used in social media? How do different language groups move through time and space? Which language groups are potentially in interaction with each other?
Tuomo Hiippala has already conducted some research into the languages of Instagram. Recently, he began a study to examine posts geotagged within 150 metres of Helsinki’s Senate Square in the Instagram dataset from 2013-2018. Most of the posts are in English, and on closer examination, he found that approximately half of all Finnish Instagram users write in English or mix different languages, typically Finnish and English.
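Selecting posts within a fixed radius of a landmark comes down to a distance test on each geotag. The sketch below uses the haversine great-circle formula, which is one common choice and not necessarily the method used in the study. The Senate Square coordinates are approximate and assumed, and the sample points and function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

# Approximate latitude and longitude of Senate Square (an assumption).
SENATE_SQUARE = (60.1695, 24.9525)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(points, centre, radius_m=150):
    """Keep the (lat, lon) points that lie within radius_m of centre."""
    return [p for p in points
            if haversine_m(p[0], p[1], centre[0], centre[1]) <= radius_m]


# Hypothetical geotags: the centre itself, a nearby point, a distant point.
sample_points = [
    (60.1695, 24.9525),
    (60.1700, 24.9530),
    (60.1800, 24.9525),
]
nearby = within_radius(sample_points, SENATE_SQUARE)
print(f"{len(nearby)} of {len(sample_points)} points fall within 150 m")
```

At this scale a flat-map approximation would give nearly the same answer, but the haversine version avoids having to choose a map projection.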