Brown Bag Seminar

Brown Bag Seminar meetings every Wednesday.

The Methodological Unit organizes a weekly Brown Bag Seminar to highlight novel methodological approaches in the humanities and social sciences. The meetings introduce methodological innovations and cutting-edge research from various disciplines in an easily accessible manner and invite interdisciplinary discussion in an easy-going atmosphere over lunch. Bring your own lunch; we bring fresh methodological topics!

Every Wednesday at 12.15.

You are welcome to join us at seminar room 524, Fabianinkatu 24 A (access via door, not courtyard), 5th floor, or online via Zoom.

The Idea

There will be a 20-minute introduction to the methodological theme, followed by an open discussion of 40 minutes. The seminars are open to everybody. We expect a multidisciplinary and methodologically curious audience from different faculties and units of the central campus. The language of the meetings can be Finnish or English.

The most important prerequisite for participation is not methodological expertise but an open mind towards methodological innovations and discussion across methodological and disciplinary boundaries.

The Program

Scroll down for the upcoming program of Brown Bag Seminars. To be notified of updates, sign up for our mailing list or follow us on social media. Click here for more information on our communication channels.

Click here for more information on past Brown Bag Seminar and Brown Bag Lunch events.

19.3.2025 Pertti Alasuutari

National Parliaments as a Global Institution

In this talk I will discuss the world’s parliaments as a global institution. By this I mean, for instance, that legislatures are very similar in their structures and practices. This is evident, say, in that the opening words of a session can be verbatim copies of those used in other parliaments. Through such isomorphic, invariant practices, legislatures symbolize, sanctify, and naturalize the nation-state as the component part of world society. The global standardization of parliamentary practices has also constituted what is understood by politics the world over, radiating to the forms that political organization into parties, ideologies and movements has assumed. Furthermore, the invariant rules observed constitute parliamentary practices as rituals. In addition to the sanctity of rational, evidence-based reasoning embodied in the authority of science, there are four sacred principles that define national parliaments as an institution: national sovereignty, parliamentary immunity, the national interest or common good, and the sanctity of fellow parliamentarians. These principles are honored in frontstage sessions, whereas bargaining between conflicting group interests takes place on backstage occasions. Yet frontstage performances are important because they frame the issues and justify the decisions taken by appeal to commonly approved virtues and principles. In addition to passing (or proposing) laws, national parliaments legitimate rule and serve as switchboards in global governance and the travel of ideas.

Pertti Alasuutari is professor emeritus of sociology at Tampere University. He is renowned for his textbooks in qualitative research, Researching Culture (Sage, 1995) and An Invitation to Social Research (Sage, 1998). During the past 20 or so years he has focused on studying politics and policymaking from a global perspective. His research interests also include media and social theory.

Click here for practical information on the Brown Bag Seminar events.

26.3.2025 Aleksi Knuutila

LLM-assisted topic modeling of noisy data: Benchmarking results and a case study of disguised propaganda from Ukrainian Telegram

In recent years, the social sciences and humanities have experimented with incorporating instruction-tuned large language models (LLMs) into their methods. Though opinions about their capabilities vary, there has been interest in substituting existing approaches in data annotation and analysis with LLM-based approaches. Much of the interest in LLMs thus far has focused on their accuracy in classification tasks and lowering research costs, partly through obviating the need for manually classified training data via zero-shot learning. At the same time, the current application of these models has been compared to an "academic Wild West" due to a lack of benchmarks or shared best practices for reliable use.

This presentation explores how LLMs could complement existing analytical methods. In particular, I test their ability to enable topic modelling on noisy text corpora where only a subset of the text is relevant to the research agenda. I illustrate the approach with a case study from the project [Eyewitness images in the war in Ukraine](https://researchportal.helsinki.fi/fi/projects/eyewitness-images-and-networked-…). This research compares Ukrainian Telegram channels in terms of how they cover the Ukrainian military operations in Kursk. Informed by domain knowledge and theoretical interests, we extract text segments related to military events and ascribed motivations from longer Telegram messages before applying established topic modelling approaches to the segments. Secondly, I test the reliability of this approach for topic modelling by comparing its results against large human-annotated benchmark datasets. The results suggest that one function for LLMs in social research could be in enabling flexible forms of feature selection (such as selecting text segments) to make complex datasets legible for established research methods.

Dr Aleksi Knuutila is a University Researcher at the Department of Sociology at the University of Helsinki and the Helsinki Institute for Social Sciences and Humanities. After his doctorate in the Digital Anthropology programme at University College London, Knuutila’s research has focused on online harms such as misinformation and harassment and how political groups take advantage of contemporary information environments. His current research projects focus on developing tools and infrastructure for journalists working on conflicts and applying generative AI to interpretative research workflows.

2.4.2025 Friederike Lüpke

Neural machine translation and language description & language documentation: shared data and methods?

Nature (NLLB team 2024) reports major progress in neural machine translation (NMT) and projects its ability to scale up to large numbers of languages for which only limited training text is available, without compromising quality. I investigate new proposals for low-resource languages, particularly those not written in formal contexts or containing multilingual ‘code-switched’ text. Existing models rely on users of these languages to translate text, which results in highly unnatural data, so-called ‘translationese’, or on very limited corpora, for instance Bible translations, which represent restricted domains of language use and are culturally heavily biased (Kuwanto et al. 2024). New proposals overcome these weaknesses by using semantically grounded multilingual written and spoken language (SLU) and by focusing on cross-linguistic transfer of learning based on similarity for NMT. This is complemented by storyboard methods, where language users retell content presented as visual stimuli, thus preventing translationese. Similar information is collected by typologists, who investigate shared constructions across languages, and by field linguists, who collect data with nonverbal stimuli. Can linguists and AI enter into fruitful collaborations that also benefit users of low-resource languages, and can NMT models based on training data provided by linguists also improve linguistic theories, or is this hope futile?

Friederike Lüpke is Professor of African Studies and chair of AfriStadi, the Africa Research Forum for Social Sciences and Humanities at the University of Helsinki. Her research focuses on language description and documentation in multilingual settings in West Africa and on small-scale multilingualism worldwide. She is committed to an epistemological and methodological renewal of these disciplines so that they represent and benefit from global perspectives and are able to account more fully for richness and diversity of language use and language ideas.

References:

Kuwanto, Garry; Urua, Eno-Abasi E.; Amuok, Priscilla Amondi; Muhammad, Shamsuddeen Hassan; Aremu, Anuoluwapo; Otiende, Verrah et al. (2024): Mitigating translationese in low-resource languages: The storyboard approach. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 11349–11360. Available online at http://arxiv.org/pdf/2407.10152.

NLLB team (2024): Scaling neural machine translation to 200 languages. In Nature 630 (8018), pp. 841–846. DOI: 10.1038/s41586-024-07335-x

9.4.2025 Marion Godman

Should data on ethnicity be collected in Europe? A philosophical-experimental approach

Marion Godman and Nicholas Haas

Collecting data on ethnicity (and often also race) is widespread globally and often regarded as a way of tracking and mitigating discrimination and other forms of inter-group inequalities. Not so in Europe (e.g. Simon 2012; European Commission 2017). Most European countries have opted to exclude not only race, but also ethnic categories from national censuses or population registers.

In this paper, we argue that not collecting data on ethnicity has several hitherto overlooked costs, both moral and epistemic.

We first respond to the idea that there is in fact no dearth of data, since ethnicity is already accounted for by more generally acceptable categories like immigrant or country of birth. We argue that these are not at all obvious proxies, because they either entirely miss or fail to distinguish epistemically relevant information. Further, we highlight how the use of alternative categories to ethnicity can lead to certain “slippages” or ambiguities in meaning, where concepts like “immigrant” function as “code” for different ethnic or racial categories while also retaining a more literal and often more encompassing interpretation.

In addition to scrutinizing the epistemic and moral arguments, we adopt an experimental philosophy approach to addressing these questions by conducting small experiments with online respondents. First, we explore whether individuals use “immigrant” as a proxy for “Muslim” (or “Arab”) when deciding whether to discriminate or not. Second, we experimentally evaluate whether individuals provide differential support for the same arguments when they concern gender as opposed to ethnicity, to test whether ethnicity is indeed a more sensitive category that should not be probed or registered (as is commonly assumed).

Marion Godman is Associate Professor at the Department of Political Science at Aarhus University and an affiliated scholar of the History and Philosophy of Science department, Cambridge University. Between 2012 and 2018 she was also based at Helsinki University, working at TINT/Centre of Excellence in Philosophy of the Social Sciences. She works on a range of issues in the philosophy of the human and social sciences and in political philosophy, and endeavours to find a synthesis between these different areas, as can be seen in her research monograph, The Epistemology and Morality of Human Kinds (2020, Routledge).

28.5.2025 Desmond Elliott

Automatically Processing Historical Documents without OCR

The digitisation of historical documents has provided historians with unprecedented research opportunities. Yet, the conventional approach to analysing historical documents involves converting them from images to text using OCR, a process that overlooks the potential benefits of treating them as images and introduces high levels of noise. To bridge this gap, we take advantage of recent advancements in pixel-based language models trained to reconstruct masked patches of pixels instead of predicting token distributions. Due to the scarcity of real historical scans, we propose a novel method for generating synthetic scans to resemble real historical documents. We then pre-train our model, PHD, on a combination of synthetic scans and real historical newspapers from the 1700-1900 period. Through our experiments, we demonstrate that PHD exhibits high proficiency in reconstructing masked image patches and provide evidence of our model’s noteworthy language understanding capabilities. Notably, we successfully apply our model to a historical QA task, highlighting its usefulness in this domain. 

Desmond Elliott is an Associate Professor and a Villum Young Investigator at the University of Copenhagen. His group currently focuses on tokenization-free language modelling and on multilingual and multimodal processing. His research output includes widely used resources and tools such as the multilingual image description dataset (Multi30K), the multimodal language understanding dataset (How2), and the pixel-based language model PIXEL.
