Matti Pirinen's Software

The source code attached to this page may be used under the terms of the GNU General Public License (version 3).
FINEMAP and metaCCA have their own licenses.

More recent software, including the linemodels package, can be found on GitHub.

biMM: Efficient estimation of genetic variances and covariances for cohorts with high-dimensional phenotype measurements.

Publication in Bioinformatics (2017) and the R-package biMM_1.0.0.tar.gz (updated 3 Mar 2017) with a vignette.

FINEMAP: Efficient variable selection using summary data from genome-wide association studies.

Publication in Bioinformatics (2016) and the software.

metaCCA: Multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.

Publication in Bioinformatics (2016) and the software.

Allele specific expression from RNA-seq read count data for multiple tissues

This source code implements the models described in the paper "Assessing allele specific expression across multiple tissues from RNA-seq read count data", Bioinformatics 2015.
R-codes plus data sets (61 MB), or just the R-codes for functions and examples.
For a user-friendly interface to the same models, see MAMBA by Manuel Rivas.

Linear mixed model software MMM

MMM is a software package for analysing a linear mixed model with one random effect whose covariance structure can be freely specified by the user. It is written with large data sets in mind and has been applied to real data sets in which hundreds of thousands of predictors on over 20,000 individuals are tested one by one. The motivation for MMM came from genome-wide association studies, but it can be used with other data as well. It is written in C (GNU C) and uses the GSL and, preferably, LAPACK libraries.
Current version 1.01 (updated 10-Feb-2014): Software. Manual.

Pirinen M, Donnelly P and Spencer CCA (2013):
Efficient Computation with a Linear Mixed Model on Large-scale Data Sets with Applications to Genetic Studies.
Ann Appl Stat 7(1): 369-390.
Text and Supplementary text available.
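
For orientation, a linear mixed model with a single random effect is commonly written as below. The notation here is an assumption chosen for illustration and may differ from the notation in the MMM manual and paper: y is the vector of outcomes, X holds the fixed-effect covariates (including the predictor being tested), R is the user-specified covariance structure of the random effect (in GWAS this is often a genetic relationship matrix), and I is the identity matrix.

```latex
\begin{align}
  y &= X\beta + u + \varepsilon, \\
  u &\sim \mathcal{N}\!\left(0,\ \sigma_u^2 R\right), \qquad
  \varepsilon \sim \mathcal{N}\!\left(0,\ \sigma_\varepsilon^2 I\right), \\
  \operatorname{Var}(y) &= \sigma_u^2 R + \sigma_\varepsilon^2 I .
\end{align}
```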

Non-confounding covariates in logistic regression

R-functions for assessing how including a single binary or continuous covariate affects the power to detect genetic variants (of small effects) in GWAS. The examples include the code for plotting the Supplementary Figures of the publication. R-functions. Examples.

Pirinen M, Donnelly P and Spencer CCA (2012):
Including known covariates can reduce power to detect genetic effects in case-control studies.
Nat Genet 44: 848-851.

Hippo and AEML

Haplotype estimation using incomplete prior information from pooled observations (Hippo) and Approximate EM-algorithm with list of known haplotypes (AEML) are available in a Tar-package.
These algorithms estimate population haplotype frequencies from pooled SNP data and can make use of a list of the haplotypes that are known to exist in the population. They are written in ANSI-C using the GNU Scientific Library (GSL v.1.0.0). For a description, see:

Pirinen M (2009):
Estimating population haplotype frequencies from pooled SNP data using incomplete prior information.
Bioinformatics 25(24):3296-3302.
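
To illustrate the general idea of estimating haplotype frequencies from pooled allele counts with a list of candidate haplotypes, here is a minimal brute-force EM sketch in Python. It is not the Hippo or AEML implementation: the function name `em_pooled_haplotypes`, its parameters, and the exhaustive enumeration of haplotype assignments are all assumptions made for illustration, and the enumeration is only feasible for very small pools and few SNPs.

```python
from itertools import product


def em_pooled_haplotypes(pools, haplotypes, pool_size, n_iter=200):
    """Estimate haplotype frequencies from pooled allele counts by EM.

    pools      : list of tuples; pools[i][j] is the count of allele 1 at SNP j in pool i
    haplotypes : list of candidate haplotypes, each a tuple of 0/1 alleles
    pool_size  : number of haplotypes (chromosomes) in every pool
    (All names are hypothetical; this is an illustrative sketch only.)
    """
    n_hap = len(haplotypes)
    n_snp = len(haplotypes[0])
    freq = [1.0 / n_hap] * n_hap  # start from uniform frequencies

    # Enumerate every ordered assignment of candidate haplotypes to a pool once,
    # and index the assignments by the allele counts they produce.
    configs_by_counts = {}
    for config in product(range(n_hap), repeat=pool_size):
        counts = tuple(sum(haplotypes[h][j] for h in config) for j in range(n_snp))
        configs_by_counts.setdefault(counts, []).append(config)

    for _ in range(n_iter):
        expected = [0.0] * n_hap
        for counts in pools:
            configs = configs_by_counts.get(tuple(counts), [])
            # E-step: weight each compatible assignment by the current frequencies.
            weights = []
            for config in configs:
                w = 1.0
                for h in config:
                    w *= freq[h]
                weights.append(w)
            total = sum(weights)
            if total == 0.0:
                continue  # pool cannot be explained by the candidate list; skip it
            for config, w in zip(configs, weights):
                for h in config:
                    expected[h] += w / total
        # M-step: renormalise the expected haplotype counts into frequencies.
        norm = sum(expected)
        if norm == 0.0:
            break
        freq = [e / norm for e in expected]
    return freq
```

Restricting `haplotypes` to a list of haplotypes known to exist in the population shrinks the set of compatible assignments, which is the intuition behind using prior haplotype information in this setting.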

APE

Allelic Path Explorer (APE) is available as a Tar-package.
APE is a program for extending partially known genotype data on a given pedigree consistently (i.e. in accordance with the Mendelian rules) to the whole pedigree. Written in ANSI-C. For details, see:

Pirinen M and Gasbarra D (2006):
Finding Consistent Gene Transmission Patterns on Large and Complex Pedigrees.
IEEE/ACM Trans. on Computational Biology and Bioinformatics 3(3):252-262.
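
To make "consistent with the Mendelian rules" concrete, the sketch below checks a single father-mother-child trio at one autosomal locus. It is only an illustration of the consistency condition, not APE's algorithm, which handles whole pedigrees with partially observed genotypes. The function names and the representation of a genotype as a pair of alleles are assumptions.

```python
from itertools import product


def trio_is_consistent(father, mother, child):
    """Return True if the child's genotype can arise from the parents' genotypes.

    Each genotype is a pair of alleles, e.g. ("A", "G"); allele order is ignored.
    (Hypothetical helper for illustration; single autosomal locus assumed.)
    """
    c1, c2 = child
    # The child must receive one allele from each parent, in either order.
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


def consistent_child_genotypes(father, mother):
    """List the distinct unordered child genotypes the parents can transmit."""
    return sorted({tuple(sorted((f, m))) for f, m in product(father, mother)})
```

For example, `trio_is_consistent(("A", "A"), ("A", "A"), ("A", "G"))` is false because neither parent carries a G allele to transmit.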