Why do they add vitamin D to milk? Is Aragorn more attractive than Legolas in the Lord of the Rings? On the Suomi24 forum, discussion topics range from the mundane to the outlandish. Thousands of new posts are made on the website every day. Researchers are using this vast amount of data to develop new tools to help them stay up to date and track the latest trends in Finnish online discussions.
“Suomi24 is a veritable goldmine for researchers,” says Professor Mika Pantzar from the National Consumer Research Centre, where the idea to study the forum first emerged.
Cooperating with Aller Media, the corporation that owns the Suomi24 forum, gave the researchers access to a massive amount of data: 70 million posts over 15 years.
“This is unique. Researchers could previously only dream of such a resource. Private companies, such as Google or Facebook, rarely give researchers access to their databases. Aller Media made an exception for us.”
Neologisms and Finnish health
The extent and diversity of the Suomi24 data provide researchers with many interesting research topics. Linguists and language technologists can use the data to study changes in the ways Finns use language and develop neologisms. Researchers interested in health and welfare can examine how Finns discuss health, their daily lives or debt.
The researchers have found that the data can also be used to study conflict management.
“Recognising social tensions at an early stage is important so that we can address problems before they become full-blown rifts,” says Pantzar. The researchers are now considering whether the Suomi24 data could help them see what kinds of comments elicit hate or increase empathy in discussions.
The researcher network Citizen Mindscapes is working on the Suomi24 data. The network currently includes 200 members from both the academic world and the private sector. Together with Aller Media, the network has surveyed the research opportunities of the Suomi24 data. The cooperation has also resulted in a user interface which makes it easier to search and process the data. The Suomi24 data has been compiled into a corpus which is available through the Language Bank of Finland. New discussions are imported into the corpus at approximately six-month intervals.
“Our goal is to develop new methods and tools which improve the usability of the Suomi24 data and make it accessible to researchers from various disciplines. Now we’re eagerly waiting to see how researchers will use the resource.”
Ask the right questions
The cooperation with Aller Media began five years ago when Mika Pantzar contacted Pauli Aalto-Setälä, CEO of Aller Media, to ask if researchers could access the Suomi24 forum.
“He said yes. Aller Media was interested to see what the academic world would have to offer them.”
And what does Pantzar think they have to offer?
“Above all, we researchers could offer companies new perspectives on their problems. Many cooperation projects fail because people focus on the wrong questions, like ‘how can we promote growth for our business’. These are not the kinds of questions researchers can answer. The job of a researcher is to ask questions that the company itself hasn’t even thought of. This makes the cooperation fruitful and prevents the waste of either party’s resources,” says Pantzar.
New ideas emerged while the researchers were working on the Suomi24 data.
“Moderating discussion forums and comments is a major challenge for media companies, as manual moderation is very expensive. Aller Media is going to adopt automatic moderation that was developed by a network member and is provided by Utopia Consulting. This is a wonderful example of how cooperation between companies and the University can lead to information sharing and innovations.”