AI boom requires new benchmarks for natural language understanding

Aarne Talman’s timely research delves into what language understanding means, how it can be measured and the weaknesses of current benchmarks.

Aarne Talman’s work helps us develop better benchmarks and, thus, better AI models, while also enabling new AI models that more closely mimic human language understanding.

Current benchmarks unable to measure language understanding capabilities of AI models

Talman says that current benchmarks measuring the language understanding capabilities of AI models do not actually measure what they claim to, as the models can perform the tasks assigned to them by relying on other patterns in the datasets.

As part of his research, Talman not only assessed benchmarks for language understanding, but also developed methods to enhance AI models’ language understanding.

“To the best of my knowledge, I was the first to apply the Stochastic Weight Averaging Gaussian (SWAG) method in the context of language understanding,” Talman says. “This method enables the development of AI models with language understanding capabilities that better capture the uncertainty involved in human language understanding.”
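The idea behind SWAG can be illustrated with a minimal sketch: snapshots of the model weights are collected during stochastic gradient descent, a Gaussian (here with a diagonal covariance) is fitted over those snapshots, and weight vectors sampled from that Gaussian yield an ensemble whose spread of predictions reflects uncertainty. The toy logistic-regression model and all variable names below are illustrative assumptions, not Talman’s actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": logistic regression on two features (illustrative stand-in
# for a language understanding model).
def predict(w, X):
    return 1.0 / (1.0 + np.exp(-(X @ w)))

# Synthetic binary classification data.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

# Gradient descent while collecting weight snapshots after a burn-in phase.
w = np.zeros(2)
lr = 0.1
snapshots = []
for epoch in range(300):
    p = predict(w, X)
    grad = X.T @ (p - y) / len(y)
    w -= lr * grad
    if epoch >= 100 and epoch % 10 == 0:
        snapshots.append(w.copy())

S = np.stack(snapshots)
mean = S.mean(axis=0)           # SWA mean of the collected weights
var = S.var(axis=0) + 1e-6      # diagonal SWAG variance estimate

# Sample weights from the fitted Gaussian and average their predictions:
# the mean is the predictive probability, the spread an uncertainty signal.
samples = [mean + np.sqrt(var) * rng.normal(size=2) for _ in range(30)]
probs = np.stack([predict(ws, X) for ws in samples])
pred_mean = probs.mean(axis=0)  # uncertainty-aware prediction
pred_std = probs.std(axis=0)    # per-example uncertainty estimate
```

Instead of a single point prediction, this ensemble-of-samples approach produces a predictive distribution, which is what allows the model to express how uncertain it is about a given input.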

In his research, Talman also clarifies concepts of language understanding and opens up discussion on what requirements AI models must meet for us to say that they understand natural language.

Current benchmarks have been used to compare AI models in terms of their language understanding capabilities.

Talman also discusses the nature of language understanding more generally and considers the extent to which AI models are able to understand language. 

“Can we say that an AI model actually understands the language it reads?” he asks.

AI will play (and is already playing) a major role in our society. Language understanding is one of the cornerstones of intelligence. 

“It’s important that we’re able to develop better AI models that more closely match human language understanding. To do so, we must grasp what language understanding means and how it can be measured.” 

Timely research on the capability of AI models to understand natural language

Aarne Talman, MSc, will defend his doctoral thesis Towards Natural Language Understanding: Developing and Assessing Approaches and Benchmarks on 23 February at 13.15 in the Doctoral Programme in Language Studies at the Department of Digital Humanities of the University of Helsinki’s Faculty of Arts.

The public examination will take place in Banquet Room 303 at Unioninkatu 33. The event can also be attended via live stream.