A popular approach to solve multi-assembly consists of building a weighted acyclic graph from the different reads and their abundances in the mixed sample. The weights in the graph form an (approximate) flow, and the different genomic sequences (and their abundances) can be derived by decomposing such flow into a set of weighted paths explaining it.
In “Fast, Flexible, and Exact Minimum Flow Decompositions via ILP” a team formed by researchers from Montana State University and University of Helsinki (Postdoctoral
Researcher Fernando Dias and Associate Professor Alexandru Tomescu) consider the NP-hard problem of decomposing the flow into the minimum number of paths. They propose a new Integer Linear Program (ILP) encoding the exponentially many solution paths using only a quadratic number of variables. This significant result will lead to more scalable, faster, and more flexible multi-assemblers based on ILP.
Pre-print: https://arxiv.org/abs/2201.10923
In “Safety and Completeness in Flow Decompositions for RNA Assembly” researchers from Montana State University and University of Helsinki (Research Assistant Milla Kortelainen, Doctoral Researcher Manuel Caceres, Postdoctoral researcher Shahbaz Khan and Associate Professor Alexandru Tomescu) studied the problem of finding sub-paths that are common to all flow decompositions. They developed a “safe and complete” algorithm that reports all 100% precise sub-solutions that can be derived from the flow, instead of reporting one candidate solution. Their algorithm efficiently finds all safe and complete paths, and these are shown to have better quality compared to existing methods.
Pre-print: https://arxiv.org/abs/2201.10372
Both research works have been accepted to the 26th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2022) and will be published in the conference proceedings.