Future search engines will help users find information they don’t even know they are looking for

Seeking information online does not have to be limited to googling – at least not if you ask researchers. Dorota Głowacka is investigating alternative methods of information retrieval.

Research on methods of information retrieval is an entire field of science whose specialists aim to provide us with ever better search results – a necessity as the amount of data keeps growing.

To succeed in their quest, researchers are focusing on the interaction between humans and computers, connecting methods of machine learning to this interaction. One of these researchers is Dorota Głowacka, who assumed an assistant professorship in machine learning and data science at the Helsinki Centre for Data Science HiDATA at the beginning of 2019.

Głowacka is studying what people search for and how they interact with search engines, with a particular focus on exploratory search. This is a search method that helps find matters relevant to the person looking for information, even if they are not entirely certain about what they are looking for to begin with.

Google does, in fact, anticipate search terms and screen results, but that is where it stops: beyond that, it simply assumes your search term produced the desired result.

A search engine based on exploratory search would categorise search results, present information that is related to the search without being directly connected with it, and propose new search terms. In addition, it would personalise search results depending on the user, but in a much more sophisticated manner than Google does.

This would be useful, for example, to scholars weeding through scientific literature.

“Articles concerning physics, mathematics and machine learning may discuss the same theme using different terms. But how can mathematicians find articles from other fields discussing matters relevant to them? This is exactly what exploratory search is. We wish to study search results while narrowing down searches and gaining an understanding of what people are actually looking for,” Głowacka explains.

Compulsory selection

Those developing new kinds of search engines are relying on a broad range of data pertaining to human-computer interaction, such as clicks, eye movements and time spent on individual webpages.

The identity of the person looking at the display screen also matters.

For instance, a student searching for literature needs something different from a professor assessing a conference presentation: the student may gain the most benefit from introductory articles, while the professor is in need of more advanced, clearly defined information.

“Even with identical search terms, an exploratory search would give them different search results,” Głowacka notes.

A giant in the way

Exploratory search sounds great, but it remains outside the mainstream. For now, it is only used in certain search engines specialised in scientific literature, and in special library collections and archives.

According to Głowacka, this is largely due to our conditioning to use Google.

“At the moment, Google’s method of information retrieval prevails. Google has also taught users to expect a list of links ranked according to how well they match the search terms. When the results include unsuitable hits, people have become used to amending their search terms by themselves.”

It is precisely our habituation to Google that Głowacka believes is preventing the wider use of new methods.

“And yet, I believe that the mainstream search engines will gradually begin making changes to the method of information retrieval. This is a result of the increasing amount of available information, such as multimedia, that current methods are unable to locate effectively. In addition, more and more consumers are beginning to understand that the current search methods do not always produce the best results.”

Stepping outside the bubble

Among the current hot topics in research on information retrieval is the extent to which search results should be modified according to the user. Such modification is a must, as there are no search engines able to present all the data available online.

Then again, excessive personalisation can result in what are known as filter bubbles. Google already selects certain types of search results that it considers relevant to you on the basis of your previous searches. For someone else, the exact same search term will produce different results.

This is also the operating principle of social media algorithms.

“Personalisation is important, but filter bubbles must not be allowed to spring up. If you only see certain media content on Facebook, you start believing that this is what the world looks like. And that’s dangerous,” says Głowacka.

According to Głowacka, interactive search engines could prevent the formation of filter bubbles, as they would force us to see content we would not choose to view ourselves.

“People would still get the desired results, only supplemented with other content. This may lead you to click on something that will change your mind and make you embark on new paths.”

Read more:

Exploratory Search and Personalisation research group

Introducing the new experts of HiDATA

This series will introduce new professors in the tenure track system of the University of Helsinki working at the Helsinki Centre for Data Science. 

Other parts of the series:

Laura Ruotsalainen, associate professor of spatio-temporal data analysis: People in motion help planners design better cities

Keijo Heljanko, professor of parallel and distributed data science: Increasing masses of data may leave computers behind and cause an energy crisis

Kai Puolamäki, associate professor of data science and atmospheric sciences: Data science interprets atmospheric particles and helps find the cleanest urban routes – if we know what to ask computers

Nikolaj Tatti, associate professor of privacy-aware and secure data science: Data science may soon expose fake news

Antti Honkela, associate professor of machine learning and data science: Everyone has their secrets – machine learning needs to respect privacy

Dorota Głowacka
  • Assistant professor of machine learning and data science at the Helsinki Centre for Data Science HiDATA. Worked at the University of Edinburgh before the University of Helsinki
  • Originally a linguist whose doctoral dissertation (University College London, 2012) focused on reinforcement learning, an area of machine learning
  • Currently investigating information retrieval methods as well as interaction between humans and computers