mtbseq-nf pipeline makes MTBseq simple and easy to use via Nextflow workflow manager.
- Fine-grained control over resource allocation (CPU/Memory/Storage)
- Reliance of bioconda and biocontainers for installing packages for reproducibility
- Ease of use on a range of infrastructure (cloud/on-prem clusters/local machine)
- Resumability for failed processes
- Centralized locations for specifying
- MTBseq parameters (
conf/global_parameters.config) - Hardware requirements (
conf/standard.config) - Software requirements (
conf/docker.configorconf/conda.config)
- MTBseq parameters (
- Dedicated user interface for all parameters for wider audience (
nextflow_schema.json). This only works on Nextflow Tower. - Easier customizability for the pipeline, using explicit parameters (
conf/global_parameters.config). - Ability to analyze genomes in parallel as well as in batch, on
local,cloudandclusterenvironments.
The simplest use case is to analyze a few genomes on a local environment. Almost all aspects are customizable but for the sake of brevity, a bare bones guide for any beginner user is as shown below
- 1. Clone the project
git clone https://github.com/mtb-bioinformatics/mtbseq-nf
cd mtbseq-nf-
2. Download the
gatk-3.8.0tar from here -
3. Untar it and place it in the
resourcesfolder
tar -xvf GATK_TAR_FILE
- 4. Move your genomes to the
data/full_datafolder
They should follow the pattern SAMPLE_R1.fastq.gz
- 5. To run the pipeline, make sure you have
condainstalled. Moreover, if you don't already havenextflowinstalled, you can use the following commands to install it
conda create -n mtbseq-nf-env -c bioconda -c conda-forge nextflow You can confirm the setup by activating that environment and using the nextflow info command
conda activate -n mtbseq-nf-env
nextflow info
- 6. Then simply issue the following command on the command line
nextflow run main.nf -profile standard,conda
This pipeline has two execution types: batch and parallel and here is a dag example for them!
The execution type is determined by the analysis_mode parameter
Contributions are warmly accepted!
The insipiration for this project itself MTBseq has a GPL-3 license as of v1.0.3.
The components related to mtbseq-nf project itself (the Nextflow wrapper code) are licensed under the liberal MPL-2.0 license.
We would like to Thank the developers of MTBseq for putting in the intial effort!

