The K-Pax2 page

K-Pax2 is an R package implementing a Bayesian model-based method for simultaneously classifying the rows and columns of a categorical data matrix. Its main application is to genetic datasets, such as DNA or protein multiple sequence alignments, but as a general method it can be applied to any kind of categorical dataset. It is based on the same idea as its predecessor, K-Pax, but its different statistical model, together with a new stochastic search algorithm, provides better accuracy.
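To make "categorical data matrix" concrete, here is a minimal sketch in base R that turns a small alignment into a matrix with one row per sequence and one column per position. The sequences and the object names (seqs, aln, n_states, variable_cols) are made up for illustration. The sketch does not call any K-Pax2 function and does not show how the package itself loads or preprocesses data.

```r
# Toy alignment: four aligned sequences of equal length (made-up data).
seqs <- c(s1 = "ACGT", s2 = "ACGA", s3 = "TCGA", s4 = "TCGT")

# Split each sequence into single characters and stack them as rows,
# giving a character matrix: rows are sequences, columns are positions.
aln <- do.call(rbind, strsplit(seqs, ""))
colnames(aln) <- paste0("pos", seq_len(ncol(aln)))

# Count how many distinct categories are observed at each position.
n_states <- apply(aln, 2, function(x) length(unique(x)))

# Positions with more than one category are the ones that can
# distinguish groups of sequences.
variable_cols <- names(n_states)[n_states > 1]
```

In this toy example the rows are the units to be classified and the columns are the sites; only pos1 and pos4 vary across sequences.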

Details of the method are given in the paper:

Pessia A., Grad Y., Cobey S., Puranen J., Corander J. (2015). K-Pax2: Bayesian identification of cluster-defining amino acid positions in large sequence datasets. Microbial Genomics 1(1). doi: 10.1099/mgen.0.000025

The K-Pax2 package requires R version 2.15 (or newer).
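A quick way to check this requirement from within R is sketched below. It uses only the base function getRversion(); the variable names min_version and r_ok are illustrative.

```r
# Compare the running R version against the stated minimum of 2.15.
min_version <- "2.15.0"
r_ok <- getRversion() >= min_version

# Stop with an informative message if the installed R is too old.
if (!r_ok) {
  stop("K-Pax2 requires R ", min_version, " or newer; found ",
       as.character(getRversion()))
}
```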

Download K-Pax2 1.0.1.

The zip file contains the R package, a README file with installation instructions, a LICENSE file, and a tutorial. A dataset of 200 HIV-1 "Env" gene protein sequences, to be analyzed during the tutorial, is also provided.

The source code is available on GitHub.