Riku Walve defends his PhD thesis on Improving Contiguity and Accuracy in Genome Assembly

On Friday the 31st of March 2023, M.Sc. Riku Walve defends his PhD thesis on Improving Contiguity and Accuracy in Genome Assembly. The thesis is related to research done in the Department of Computer Science and the Algorithms for Biological Sequencing Data team of the Algorithmic Bioinformatics group.

M.Sc. Riku Walve defends his doctoral thesis Improving Contiguity and Accuracy in Genome Assembly on Friday the 31st of March 2023 at 12 o'clock in Auditorium CK112 of the Exactum building, University of Helsinki (Pietari Kalmin katu 5, basement). His opponent is Associate Professor Cinzia Pizzi (University of Padua, Italy), and the custos is Professor Veli Mäkinen (University of Helsinki). The defence will be held in English.

The thesis of Riku Walve is a part of research done in the Department of Computer Science and in the Algorithms for Biological Sequencing Data team of the Algorithmic Bioinformatics group. His supervisor has been Docent, University Lecturer Leena Salmela (University of Helsinki).

Improving Contiguity and Accuracy in Genome Assembly

Although genome analysis has many applications, understanding the effects genes have on humans is arguably the most significant. A fundamental roadblock to genome analysis is that genomes cannot be sequenced in their entirety. Instead, only short, error-prone sequences, called reads, can be read from a genome. An important step in analyzing a genome is therefore assembling the reads into the full genome.

This thesis looks at both problems: correcting errors in the reads and assembling genomes. Errors in the reads can be corrected by constructing a multiple sequence alignment over the set of reads. To correct the errors efficiently, the multiple sequence alignment has to be approximated.
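As a rough illustration of why an alignment helps with error correction, the sketch below takes reads that are assumed to be already aligned (gap-padded to equal length with "-") and picks the most common symbol in each column. The function name consensus and the example reads are hypothetical, and this simple majority vote is not the method developed in the thesis.

```python
from collections import Counter


def consensus(aligned_reads):
    """Column-wise majority vote over equal-length, gap-padded reads.

    Assumption: the reads have already been aligned; building that
    alignment is the hard part and is not shown here.
    """
    if not aligned_reads:
        return ""
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")

    corrected = []
    for column in zip(*aligned_reads):
        # Most frequent symbol in this alignment column.
        symbol, _count = Counter(column).most_common(1)[0]
        # A winning gap means the column is dropped from the consensus.
        if symbol != "-":
            corrected.append(symbol)
    return "".join(corrected)


# Hypothetical reads: one substitution error and one deletion.
aligned = ["ACGTAC", "ACCTAC", "ACGTAC", "ACGT-C"]
corrected = consensus(aligned)
```

Here the isolated errors are outvoted by the other reads covering the same positions, which is the intuition behind alignment-based correction.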

Guided genome assembly is a variation on genome assembly, where we are additionally given data describing some structural information on the genome. This thesis describes two guided genome assembly methods. One uses linear location information, and the other uses a more general framework based on clustering the reads.

Finally, this thesis reconsiders the problems of genome analysis from the perspective of optical maps. Optical maps represent a genome as a sequence of lengths between cuts, rather than as nucleotides. Because this data looks fundamentally different, the problem of efficient indexing is evaluated specifically in the context of optical maps.
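To make the difference in data concrete, the sketch below turns a nucleotide string into a list of fragment lengths by cutting at every occurrence of a recognition site. The function name optical_map, the example sequence, and the choice to cut at the start of each site are simplifying assumptions for illustration; real optical maps are measured experimentally and contain sizing errors and missing or spurious cuts.

```python
def optical_map(sequence, site):
    """Return the lengths of the fragments between cut sites.

    Assumption: a cut is placed at the start of every occurrence of
    `site`; zero-length fragments are skipped.
    """
    if not site:
        raise ValueError("site must be a non-empty string")

    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)

    boundaries = [0] + cuts + [len(sequence)]
    return [end - start
            for start, end in zip(boundaries, boundaries[1:])
            if end > start]


# Hypothetical 17-base sequence containing the site twice.
fragments = optical_map("AAGAATTCCCGAATTCT", "GAATTC")
```

The result is a short list of integers instead of a string over A, C, G and T, which is why index structures designed for text do not carry over directly.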

Availability of the dissertation

An electronic version of the doctoral dissertation is available on the e-thesis site of the University of Helsinki at http://urn.fi/URN:ISBN:978-951-51-8984-4.

Printed copies will be available on request from Riku Walve: riku.walve@helsinki.fi.