Workshop on Data Structures in Bioinformatics

The 4th Workshop on Data Structures in Bioinformatics took place on May 16-17 in Helsinki.

The timing could not have been better for hosting a workshop in the centre of Helsinki: the city was full of happy people enjoying the sun in rare May weather of over 25 degrees Celsius. On Tuesday evening, we welcomed the participants of the Workshop on Data Structures in Bioinformatics with some drinks and snacks in restaurant Kaisla.

The Wednesday program started at Minerva Square (University of Helsinki Learning Centre Minerva) with a session on pan-genomics: Jasmijn Baaijens gave a talk on viral quasispecies reconstruction using variation graphs, and Daniel Valenzuela continued with pan-genomic references for variant calling. After a break, a k-mer session took place, with Pall Melsted proposing how to choose your k-mers wisely, Sven Rahmann leading us towards ultrafast hashing of k-mers, and Jarno Alanko considering compact representation of Markov models with variable-length k-mers. After the lunch break, we continued with a de Bruijn graph session: Guillaume Holley discussed scalable construction of a colored de Bruijn graph, Alan Kuhnle presented a dynamic de Bruijn graph, and Pierre Morisse covered read error correction using a variable-order de Bruijn graph. The workshop dinner took place in restaurant Elite.

The Thursday program started with a human genomics session: Tony Cox gave insight into the use of unmapped sequences, and Hannes P. Eggertsson presented the Graphtyper tool for population-scale genotyping using pangenome graphs. After a break, a data structure session took place, with Eric Rivals talking about hierarchical overlap graphs, Johannes Fischer about distributed suffix array construction, and Sofia Teixeira about phylogenetic trees. After lunch, we continued with a session on the Burrows-Wheeler Transform (BWT): Nicola Prezza considered mutation detection from the BWT, Bastien Cazaux described a connection between the Aho-Corasick automaton and the BWT, and Veli Mäkinen introduced the use of the positional BWT for founder reconstruction.

The local organizers of the event were Simon Puglisi, Veli Mäkinen and Leena Salmela. The workshop aims to complement the conferences with a more relaxed forum that offers an opportunity to discuss the latest ongoing work. To facilitate this, the two-day program started late, ended early, and had long breaks. Talks typically took half an hour, but there was no pressure to stop on time.

The workshop runs in a round-robin fashion, with the organizers handling the local costs. We are very grateful to our sponsors: the Foundations of Computational Health programme of HIIT and the BIRDS Project, funded by the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 690941.