Discovering new patterns in scientific literature and news data

Dorota Glowacka and Alan Medlar (University of Helsinki) will talk about their research on using topic models to analyse scientific literature.

This work stems from a concern that the content or structure of scientific articles could affect the results of interactive information retrieval experiments. The speakers will show that how well an abstract summarises the full text varies by subfield, which could adversely affect perceived retrieval performance. They will demonstrate that this variation is not random, but a consequence of the style and writing conventions of different disciplines. Indeed, these features can be used to infer an “evolutionary” tree of subfields within Computer Science. Finally, Glowacka and Medlar will touch upon their current work on using neural language models to identify political bias in online newspapers.