Installation guide

CSC module

Lazypipe is available as a preinstalled module on the Puhti server at the Finnish Center of Scientific Computing (CSC).

For usage instructions, please refer to the CSC documentation.

Note: we recommend reserving at least 4 GB of RAM per core when running Lazypipe.
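
As a rough illustration of the note above, the sketch below computes the total memory to request for a given number of cores at 4 GB per core. The function name lazypipe_mem_gb is hypothetical and not part of Lazypipe; how the resulting figure is passed to the batch system depends on your CSC setup.

```shell
# Hypothetical helper (not part of Lazypipe): total RAM in GB to reserve
# for a given number of cores, at the recommended minimum of 4 GB per core.
lazypipe_mem_gb() {
    cores="$1"
    per_core_gb="${2:-4}"   # second argument overrides the 4 GB default
    echo $(( cores * per_core_gb ))
}

# Example: memory to reserve for a run on 8 cores
echo "Reserve at least $(lazypipe_mem_gb 8) GB for 8 cores"
```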

How to install lazypipe

Clone the repository

Start by cloning the Lazypipe repository:

git clone https://plyusnin@bitbucket.org/plyusnin/lazypipe.git
cd lazypipe

Install Binaries with Conda

conda create -n blast -c bioconda blast
conda create -n lazypipe -c bioconda -c eclarke bwa centrifuge csvtk fastp krona megahit mga minimap2 samtools seqkit spades snakemake-minimal taxonkit trimmomatic numpy scipy fastcluster requests

Or from conda yaml files:

conda env create -f blast.yml
conda env create -f lazypipe.yml

This creates a separate conda environment for blast. All other tools are installed in the lazypipe environment. To activate all installed binaries, type:

conda activate blast
conda activate --stack lazypipe
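
After activating both environments, it can be useful to confirm that the binaries are actually visible on your PATH. The following is a minimal sketch, not part of Lazypipe: check_tools is a hypothetical helper, and the executable names in the example are assumed to match the conda package names (some packages, such as krona and spades, install executables under different names and are left out).

```shell
# Hypothetical helper: report which of the given commands are not on PATH.
# Prints one "missing: <name>" line per absent command and returns 1
# if at least one command is missing, 0 otherwise.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example (run after "conda activate --stack lazypipe"):
# check_tools blastn bwa centrifuge csvtk fastp megahit minimap2 samtools seqkit snakemake taxonkit
```

No output means every listed command was found.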

Set the taxonomy database location for KronaGraph (you may need to replace $CONDA_PREFIX and $data according to your settings):

rm -rf $CONDA_PREFIX/conda/env/lazypipe/opt/krona/taxonomy
ln -s $data/taxonomy $CONDA_PREFIX/conda/env/lazypipe/opt/krona/taxonomy

Set the environment variable $TM to point to the Trimmomatic directory:

export TM=$CONDA_PREFIX/share/trimmomatic

Download PANNZER (version 02/2022 or later), then make runsanspanz.py executable and available on your PATH:

wget http://ekhidna2.biocenter.helsinki.fi/sanspanz/SANSPANZ.3.tar.gz
tar -zxvf SANSPANZ.3.tar.gz
sed -i "1 i #!$(which python)" SANSPANZ.3/runsanspanz.py
ln -sf $(pwd)/SANSPANZ.3/runsanspanz.py ~/bin/runsanspanz.py
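
The symlink step above assumes that ~/bin exists and is listed in your PATH, which is not the case on every system. A minimal sketch for checking this is shown below; dir_on_path is a hypothetical helper, and the commented lines show one possible way to set up ~/bin for the current session.

```shell
# Hypothetical helper: return 0 if directory $1 appears in the
# colon-separated list $2 (defaults to the current $PATH), else 1.
dir_on_path() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example: create ~/bin if needed and add it to PATH for this session.
# mkdir -p "$HOME/bin"
# dir_on_path "$HOME/bin" || export PATH="$HOME/bin:$PATH"
```

To make the change permanent, the export line would normally go into your shell startup file.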

Install Perl modules

Install the modules to the local-lib directory ~/perl5:

cpan --local-lib=~/perl5 File::Basename File::Temp Getopt::Long YAML::Tiny
export PERL5LIB=~/perl5/lib/perl5:$PERL5LIB
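
To confirm that the modules can be found through PERL5LIB, you can try loading each one. This is a sketch under the assumption that perl is on your PATH; check_perl_modules is a hypothetical helper, not something shipped with Lazypipe.

```shell
# Hypothetical helper: try to load each Perl module with "perl -M<module>".
# Prints one "missing: <module>" line per module that fails to load and
# returns 1 if any module failed, 0 otherwise.
check_perl_modules() {
    rc=0
    for mod in "$@"; do
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "missing: $mod"
            rc=1
        fi
    done
    return "$rc"
}

# Example:
# check_perl_modules File::Basename File::Temp Getopt::Long YAML::Tiny
```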

Install R libraries

Open an R console and type:

install.packages( c("reshape","openxlsx") );

Install local databases

Open config.yaml and set local paths for taxonomy, blastp_db, blastn_vi_db and minimap_db.
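
Before installing the databases, a quick check that all four keys are present in config.yaml may save a failed run. The sketch below assumes the keys appear as unindented top-level entries of the form "key: value", which may not match the actual layout of your config.yaml; missing_config_keys is a hypothetical helper.

```shell
# Hypothetical helper: print every listed key that has no "key:" entry
# at the start of a line in the given YAML file.
# Assumes top-level, unindented keys.
missing_config_keys() {
    config="$1"
    shift
    for key in "$@"; do
        grep -Eq "^${key}:" "$config" || echo "$key"
    done
}

# Example:
# missing_config_keys config.yaml taxonomy blastp_db blastn_vi_db minimap_db
```

The check only tests that a key is present; it does not verify that the path it points to exists.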

Start by installing the NCBI taxonomy database:

perl perl/install_db.pl --db taxonomy

Then install the blastn_vi_db database:

perl perl/install_db.pl --db blastn_vi

To use Lazypipe annotations with minimap2, download and unpack the latest NCBI nt archaea+bacteria+viruses (link in Table 1).

To use Lazypipe with Centrifuge, download and unpack your preferred Centrifuge index.

 

| URL | Local path (config.yaml) | Installation | Description |
|---|---|---|---|
| NA | blastp_db | See NCBI manual | NCBI BLAST nr database |
| blastn_vi_db_url | blastn_vi_db | perl/install_db.pl --db blastn_vi | NCBI GenBank Viruses Complete genomes |
| fairdata link | centrifuge_db | download and unpack data/nt_2021_12_habv_cent.tar.gz | Centrifuge index with Hsapiens_GRCh38p13 assembly + bacteria + archaea + virus sequences from NCBI nt database |
| fairdata link | minimap_db | download and unpack data/2022_11_04.nt_abv.tar.gz | Archaeal, bacterial and virus sequences from NCBI nt database |
| taxonomy_url | taxonomy | perl/install_db.pl --db taxonomy | NCBI Taxonomy database |

Table 1. Summary of the databases used by Lazypipe.

Test Perl and Snakemake interfaces

You are now all set up. To test the Perl interface, type:

perl lazypipe.pl

To test the Snakemake interface, type:

snakemake -np all

Complete list of Lazypipe dependencies

| Tool | Website | Download binaries | Original article |
|---|---|---|---|
| [blast] | https://blast.ncbi.nlm.nih.gov/ | blast+/LATEST/ | https://doi.org/10.1186/1471-2105-10-421 |
| bwa-mem | https://github.com/lh3/bwa | bio-bwa/files | https://arxiv.org/abs/1303.3997 |
| [Centrifuge] | https://ccb.jhu.edu/software/centrifuge/ | centrifuge-1.0.3-beta-Linux_x86_64.zip | https://doi.org/10.1101/gr.210641.116 |
| csvtk | https://bioinf.shenwei.me/csvtk/ | csvtk/download | |
| fastp | https://github.com/OpenGene/fastp | http://opengene.org/fastp/fastp | https://doi.org/10.1093/bioinformatics/bty560 |
| KronaTools | https://github.com/marbl/Krona/wiki/KronaTools | NA | https://doi.org/10.1186/1471-2105-12-385 |
| MEGAHIT | https://github.com/voutcn/megahit/ | MEGAHIT-1.2.9-Linux-x86_64-static.tar.gz | https://doi.org/10.1016/j.ymeth.2016.02.020 |
| MGA | http://metagene.nig.ac.jp/metagene/ | http://metagene.nig.ac.jp/metagene/download_mga.html | https://doi.org/10.1093/nar/gkl723 |
| [minimap2] | https://github.com/lh3/minimap2 | minimap2-2.24_x64-linux.tar.bz2 | https://doi.org/10.1093/bioinformatics/bty191 |
| PANNZER/SANS | http://ekhidna2.biocenter.helsinki.fi/sanspanz/ | SANSPANZ.3.tar.gz | https://doi.org/10.1002/pro.4193 |
| TaxonKit | https://bioinf.shenwei.me/taxonkit/ | taxonkit/releases/tag/v0.9.0 | https://doi.org/10.1016/j.jgg.2021.03.006 |
| [Trimmomatic] | https://github.com/usadellab/Trimmomatic | v0.39.tar.gz | https://doi.org/10.1093/bioinformatics/btu170 |
| Samtools | http://www.htslib.org/ | samtools-1.14.tar.bz2 | https://doi.org/10.1093/gigascience/giab008 |
| SeqKit | https://bioinf.shenwei.me/seqkit/ | seqkit_linux_amd64.tar.gz | https://doi.org/10.1371/journal.pone.0163962 |
| [Snakemake] | https://snakemake.readthedocs.io/ | NA | https://doi.org/10.12688/f1000research.29032.2 |
| [SPAdes] | https://github.com/ablab/spades | SPAdes-3.15.3-Linux.tar.gz | https://doi.org/10.1002/cpbi.102 |
| Perl modules | File::Basename, File::Temp, Getopt::Long, YAML::Tiny | | |
| R reshape | | | https://doi.org/10.18637/jss.v021.i12 |
| R openxlsx | | | |

Table 2. Lazypipe dependencies. Square brackets mark tools that are not required for basic Lazypipe runs. When installed, these provide additional options and functionality.