Nextflow pipeline

Introduction

The pipeline is built with Nextflow, a workflow manager that runs tasks across multiple compute infrastructures in a portable manner. It supports the conda package manager as well as Singularity and Docker containers, which makes installation easier and results highly reproducible.

Pipeline summary

Quick help

nextflow run main.nf --help
N E X T F L O W  ~  version 19.10.0
Launching `main.nf` [stupefied_darwin] - revision: aa905ab621
=======================================================

Usage:

Mandatory arguments:
--reads [file]                   Path to input data (must be surrounded with quotes)
--samplePlan [file]              Path to sample plan file if '--reads' is not specified
--genome [str]                   Name of the reference genome. See the `--genomeAnnotationPath` to defined the annotation path
-profile [str]                   Configuration profile to use (multiple profiles can be specified with comma separated values)

Inputs:
--design [file]                  Path to design file for extended analysis
--singleEnd [bool]               Specifies that the input is single-end reads

Skip options: All are false by default
--skipSoftVersion [bool]         Do not report software version
--skipMultiQC [bool]             Skip MultiQC

Other options:
--metadata [dir]                Add metadata file for multiQC report
--outDir [dir]                  The output directory where the results will be saved
-w/--work-dir [dir]             The temporary directory where intermediate data will be saved
-name [str]                      Name for the pipeline run. If not specified, Nextflow will automatically generate a random mnemonic

=======================================================
Available profiles
-profile test                    Run the test dataset
-profile conda                   Build a new conda environment before running the pipeline. Use `--condaCacheDir` to define the conda cache path
-profile multiconda              Build a new conda environment per process before running the pipeline. Use `--condaCacheDir` to define the conda cache path
-profile path                    Use the installation path defined for all tools. Use `--globalPath` to define the installation path
-profile multipath               Use the installation paths defined for each tool. Use `--globalPath` to define the installation path
-profile docker                  Use the Docker images for each process
-profile singularity             Use the Singularity images for each process. Use `--singularityPath` to define the path of the singularity containers
-profile cluster                 Run the workflow on the cluster, instead of locally

Quick run

The pipeline can be run on any infrastructure from a list of input files or from a sample plan as follows:

Run the pipeline on a test dataset

See the file conf/test.config to set your test dataset.

nextflow run main.nf -profile test,conda

Run the pipeline from a `sample plan` and a `design` file

nextflow run main.nf --samplePlan mySamplePlan.csv --design myDesign.csv --genome 'hg19' --genomeAnnotationPath /my/annotation/path --outDir /my/output/dir

Defining the '-profile'

By default (without any profile), Nextflow executes the pipeline locally and expects all tools to be available from your PATH environment variable.

In addition, several Nextflow profiles are available that allow:

the use of conda or containers instead of a local installation,
the submission of the pipeline on a cluster instead of on a local architecture.

The description of each profile is available on the help message (see above).

Here are a few examples to set the profile options:

Run the pipeline locally, using a global environment where all tools are installed (built by conda, for instance)

-profile path --globalPath /my/path/to/bioinformatics/tools

Run the pipeline on the cluster, using the Singularity containers

-profile cluster,singularity --singularityPath /my/path/to/singularity/containers

Run the pipeline on the cluster, building a new conda environment

-profile cluster,conda --condaCacheDir /my/path/to/condaCacheDir

For details about the different profiles available, see Profiles.
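When runs are launched from a script, the profile examples above can be assembled programmatically. The following is a minimal sketch, not part of the pipeline: `build_command` is a hypothetical helper, and it only combines options that appear in the help message (`-profile` with comma-separated values, followed by `--option value` pairs).

```python
import shlex


def build_command(profiles, params, script="main.nf"):
    """Assemble a `nextflow run` command line (hypothetical helper)."""
    command = ["nextflow", "run", script]
    if profiles:
        # Several profiles are combined with commas, e.g. cluster,singularity
        command += ["-profile", ",".join(profiles)]
    for name, value in params.items():
        # Pipeline options use a double dash, e.g. --outDir /my/output/dir
        command += [f"--{name}", str(value)]
    return command


command = build_command(
    ["cluster", "singularity"],
    {
        "samplePlan": "mySamplePlan.csv",
        "genome": "hg19",
        "singularityPath": "/my/path/to/singularity/containers",
        "outDir": "/my/output/dir",
    },
)
# Print the command as a single shell-safe string
print(shlex.join(command))
```

The list form can be passed directly to `subprocess.run`, which avoids quoting problems with paths that contain spaces.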

Sample plan

A sample plan is a CSV (comma-separated) file that lists all the samples with their biological IDs. The sample plan is expected to contain the following fields (with no header):

SAMPLE_ID,SAMPLE_NAME,path/to/R1/fastq/file,path/to/R2/fastq/file (for paired-end only)
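A sample plan can be sanity-checked before launching a run. The sketch below is only an illustration under stated assumptions: `read_sample_plan` and `Sample` are hypothetical names, the sample names and paths are placeholders, and the check assumes three fields for single-end data and four for paired-end data, as the field list above suggests.

```python
import csv
import io
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    sample_id: str
    sample_name: str
    r1: str
    r2: Optional[str]  # None for single-end data


def read_sample_plan(handle, single_end=False):
    """Parse a headerless sample plan and check the field count of each row."""
    expected = 3 if single_end else 4
    samples = []
    for line_no, row in enumerate(csv.reader(handle), start=1):
        if not row:  # skip blank lines
            continue
        fields = [field.strip() for field in row]
        if len(fields) != expected:
            raise ValueError(
                f"line {line_no}: expected {expected} fields, found {len(fields)}"
            )
        r2 = None if single_end else fields[3]
        samples.append(Sample(fields[0], fields[1], fields[2], r2))
    return samples


# Hypothetical paired-end sample plan (names and paths are placeholders)
example = io.StringIO(
    "A949C08,sampleA,/data/A949C08_R1.fastq.gz,/data/A949C08_R2.fastq.gz\n"
    "A949C02,controlA,/data/A949C02_R1.fastq.gz,/data/A949C02_R2.fastq.gz\n"
)
samples = read_sample_plan(example)
print([sample.sample_id for sample in samples])
```

Pass `single_end=True` when the run uses `--singleEnd`, so that rows with three fields are accepted.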

Design control

A design file is a CSV file (with a header line) that provides additional details on the samples and how they should be processed. Here is a simple example:

SAMPLEID,CONTROLID,GROUP
A949C08,A949C02,1
...
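A design file can be cross-checked against the sample plan before a run. The sketch below rests on an assumption that this README does not state explicitly: that the SAMPLEID and CONTROLID values refer to the SAMPLE_ID field of the sample plan. `check_design` is a hypothetical helper, not part of the pipeline.

```python
import csv
import io


def check_design(design_handle, sample_ids):
    """Return design IDs that are absent from the sample plan (assumed rule)."""
    missing = []
    for row in csv.DictReader(design_handle):
        for column in ("SAMPLEID", "CONTROLID"):
            value = (row.get(column) or "").strip()
            # Report each unknown ID once, in order of appearance
            if value and value not in sample_ids and value not in missing:
                missing.append(value)
    return missing


# Design file content taken from the example above
design = io.StringIO("SAMPLEID,CONTROLID,GROUP\nA949C08,A949C02,1\n")
# IDs that would come from the first column of the sample plan
known_ids = {"A949C08", "A949C02"}
print(check_design(design, known_ids))
```

An empty list means every ID in the design file was found in the sample plan.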

Genome annotations

The pipeline does not provide any genomic annotations but expects them to be already available on your system. The path to the genomic annotations can be set with the --genomeAnnotationPath option as follows:

nextflow run main.nf --samplePlan mySamplePlan.csv --design myDesign.csv --genome 'hg19' --genomeAnnotationPath /my/annotation/path --outDir /my/output/dir

For more details see Reference genomes.

Full Documentation

Credits

This pipeline has been written by

Contacts

For any question, bug report, or suggestion, please use the issue system or contact the bioinformatics core facility.

bioinfo-pf-curie/geniac-template
