Here you can find links to software and corpora developed by the Multimodality Research Group.
Abulafia – a tool for fair and reproducible crowdsourcing

Abulafia is a Python tool for designing complex crowdsourcing pipelines on the Toloka crowdsourcing platform.

Crowdsourcing tasks are defined in YAML files, which are then combined into larger pipelines using Python.

In addition to making the crowdsourcing process transparent and reproducible, the tool seeks to discourage unethical practices related to the use of crowdsourcing.

AI2D-RST – a multimodal corpus of 1000 primary school science diagrams

The AI2D-RST corpus covers a subset of the Allen Institute for Artificial Intelligence Diagrams Dataset (AI2D), which was originally developed for training and evaluating algorithms for automatic diagram processing.

The diagrams represent topics in primary and secondary school natural sciences, such as life cycles, food webs and carbon cycles.

AI2D-RST adds expert annotations for compositionality, discourse structure and connectivity to 1000 diagrams from AI2D. These annotations are represented using graphs.
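
To illustrate what representing annotations as graphs can look like, the following is a minimal Python sketch that describes one made-up diagram with three separate graphs, one per annotation layer. It does not reproduce the actual AI2D-RST data format: the `AnnotationGraph` class, the node identifiers (`G1`, `B1`, `T1` and so on) and the attribute names and labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationGraph:
    """A minimal directed graph with attributes on nodes and edges."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (source, target, attribute dict)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, source, target, **attrs):
        # Refuse edges whose endpoints have not been added as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, target, attrs))

    def children(self, node_id):
        # Targets of all edges leaving node_id, in insertion order.
        return [t for s, t, _ in self.edges if s == node_id]


# Hypothetical diagram: two pictures (B1, B2), a text label (T1) and an arrow (A1).

# Layer 1: compositionality, i.e. which elements are grouped together.
grouping = AnnotationGraph()
grouping.add_node("G1", kind="group")
grouping.add_node("B1", kind="blob")
grouping.add_node("T1", kind="text")
grouping.add_edge("G1", "B1")
grouping.add_edge("G1", "T1")

# Layer 2: connectivity, i.e. which elements are linked by arrows or lines.
connectivity = AnnotationGraph()
connectivity.add_node("B1", kind="blob")
connectivity.add_node("B2", kind="blob")
connectivity.add_edge("B1", "B2", via="A1", directed=True)

# Layer 3: discourse structure, i.e. relations that hold between elements.
discourse = AnnotationGraph()
discourse.add_node("R1", kind="relation", label="identification")
discourse.add_node("B1", kind="blob")
discourse.add_node("T1", kind="text")
discourse.add_edge("R1", "B1", role="nucleus")
discourse.add_edge("R1", "T1", role="satellite")

print(grouping.children("G1"))
```

Keeping each layer in its own graph, with shared node identifiers, lets the layers be queried separately or cross-referenced through the identifiers they have in common.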