The tool is freely available at
We developed a linear time algorithm to segment a multiple alignment into founder blocks. A founder block is a shared genome segment. The algorithm is given the minimum length of a block, and it finds the optimal segmentation such that the maximum of the number of distinct blocks required is minimized over all segments. If this maximum is k, then one can combine the blocks in arbitrary way to form k founder sequences, such that all rows of the input multiple alignment can be read through the founders block by block, possibly jumping at block boundaries. We also provide a bipartite matching based heuristic to minimize the number of jumps (discontinuities).
Founders can be used for compactly representing the pangenome of virus strains. To illustrate this, we constucted the MSA of 1484 strains of
A simple derivative of founder block computation is the creation of founder block graphs: nodes are founder blocks and edges are between consecutive blocks that are supported by at least one row of the original MSA. In our example illustration above, we would have 9 nodes and 11 edges. We provide a simple
If you use the tool, kindly cite us as follows:
Tuukka Norri, Bastien Cazaux, Dmitry Kosolobov, Veli Mäkinen:
Linear time minimum segmentation enables scalable founder reconstruction. Algorithms Mol. Biol. 14(1): 12:1-12:15 (2019). Earlier in WABI 2018.