Jigsaw Scripts

These scripts are designed to work on the San Diego and San Francisco clusters.

The tools are installed in /illumina/scratch/Jigsaw/tools/jigsaw.

If you need to make any changes, a pull request is appreciated so I can keep track of things. If you clone your own copy, the pipeline will use the scripts in your source tree, so you can feel safe making related changes to several scripts.

Usage

In general, most of the scripts require 2 parameters:

-r : run folder
-o : output folder

Example:

jigsaw/assemble_flowcell -r /illumina/scratch/Jigsaw/NMP/NMP_Seq_Runs/MyFlowcell \
  -o /illumina/scratch/Jigsaw/Assemblies/MyAssemblies

Other parameters are usually just for internal helper scripts. One exception is -s, which sets the scratch root directory. By default, it is /scratch for SGE jobs and output/_scratch for interactive jobs.
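The default described above can be sketched as a small shell function. This is a minimal illustration, not the logic in the Jigsaw scripts: the function name default_scratch_root is hypothetical, "output" is read as the folder passed with -o, and an SGE job is assumed to be detectable through the JOB_ID variable that Grid Engine sets in the job environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Jigsaw scripts): choose a default
# scratch root as the README describes.
# Assumption: running under SGE is detected via the JOB_ID environment
# variable; the real scripts may detect this differently.
default_scratch_root() {
    local output_dir="$1"
    if [ -n "${JOB_ID:-}" ]; then
        # Inside an SGE job: use node-local scratch.
        echo "/scratch"
    else
        # Interactive run: keep scratch under the output folder.
        echo "${output_dir}/_scratch"
    fi
}
```

A value given with -s would simply take precedence over whatever this function returns.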

The output folder must not be a subdirectory of the run folder: nesting it there makes it difficult to sync to local scratch space, and it's a bad habit to mix inputs and outputs anyway.
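If you wrap the scripts in your own tooling, a check along these lines can catch a nested output folder before a job is submitted. It is a sketch under assumptions: the function name is_inside is hypothetical, it is not taken from the Jigsaw scripts, and it relies on GNU coreutils realpath (the -m option resolves paths that do not exist yet).

```shell
#!/usr/bin/env bash
# Hypothetical check (not from the Jigsaw scripts).
# Returns 0 when path $2 is the same as, or nested under, path $1.
# Assumes GNU coreutils realpath; -m allows paths that do not exist yet.
is_inside() {
    local parent child
    parent="$(realpath -m -- "$1")"
    child="$(realpath -m -- "$2")"
    # Drop a trailing slash so that "/" is handled like any other parent.
    parent="${parent%/}"
    child="${child%/}"
    # Compare with a trailing slash so /a/run does not match /a/runs.
    case "${child}/" in
        "${parent}/"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_inside "$run_folder" "$output_folder" && echo "output folder is inside the run folder" >&2` would warn before any work starts; the variable names here are placeholders.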

The scripts

assemble_flowcell

This script takes the primary parameters and submits an SGE job to assemble the flowcell. The output from the job is captured in the output directory you specify, and you will be emailed when the job completes.

scripts/assemble_flowcell

This is the script submitted to SGE. If you want to run the whole pipeline interactively, and you have a whole node, you can run this.

scripts/align_samples

Uses Isis/BWA to align the samples to the references provided in the sample sheet. Also creates the FASTQ files with adapters trimmed and reads reverse-complemented.

The results are placed in output/Alignment.

scripts/assemble_samples

Uses SPAdes 3.3.1 to assemble the FASTQ files. The results are placed in output/spades/SampleID.

scripts/generate_all_metrics

Uses various tools to create metrics. The following folders are placed in output:

picard/SampleID
quast/SampleID
visualization/SampleID

Visualizations

Files called EC.report.html, BC.report.html, and RS.report.html will appear in the directory above the output directory. These will display per-organism metrics and link out to other reports.

The visualization directory will contain a version of the assembly scaffolds aligned back to the reference genome. It will also contain a BED file called gaps.bed that can be used in IGV to quickly locate regions of the reference not covered by any contigs in the assembly.
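Besides loading gaps.bed in IGV, the file can be summarised on the command line. The sketch below assumes gaps.bed is a standard tab-separated BED file whose first three columns are chromosome, start, and end; the function name summarize_gaps is hypothetical and not part of the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Jigsaw scripts): count the intervals
# in a BED file and sum their lengths.
# BED coordinates are 0-based and half-open, so a length is end - start.
summarize_gaps() {
    awk -F'\t' '
        /^(track|browser)[ \t]/ || /^#/ { next }   # skip optional header lines
        NF >= 3 { n++; total += $3 - $2 }
        END { printf "%d gaps, %d bp uncovered\n", n, total }
    ' "$1"
}
```

It would be called with the path to a sample's BED file, for example `summarize_gaps output/visualization/SampleID/gaps.bed`, following the folder layout listed earlier.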

Genomes

The reference genomes we use are under /illumina/scratch/Jigsaw/genomes. Only B. cereus is the same as what is in iGenomes. If you use IGV to inspect the alignments and assemblies, you must use the reference genomes from this location.
