jdlogicman/jigsaw

Jigsaw Scripts

These scripts are designed to work on the San Diego and San Francisco clusters.

The tools are installed in /illumina/scratch/Jigsaw/tools/jigsaw.

If you need to make any changes, a pull request is appreciated so I can keep track of things. If you clone your own copy, the pipeline will use the scripts in your source tree, so you can feel safe making related changes to several scripts.

Usage

Most of the scripts require two parameters:

-r : run folder
-o : output folder

Example:

jigsaw/assemble_flowcell -r /illumina/scratch/Jigsaw/NMP/NMP_Seq_Runs/MyFlowcell \
  -o /illumina/scratch/Jigsaw/Assemblies/MyAssemblies

Other parameters are usually only for internal helper scripts. One exception is -s, which sets the scratch root directory. By default, it is /scratch for SGE jobs and output/_scratch for interactive jobs.
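The scratch defaults described above can be sketched as a small shell function. This is only an illustration: default_scratch_root is a hypothetical helper, and checking the JOB_ID variable (which SGE sets inside a job) is an assumption about how a script could tell the two cases apart, not necessarily how the pipeline does it.

```shell
# Hypothetical helper: mirrors the documented defaults, not the pipeline's own code.
# $1 is the output folder passed with -o.
default_scratch_root() {
  if [ -n "${JOB_ID:-}" ]; then
    # Assumption: JOB_ID is set, so we are running inside an SGE job.
    echo "/scratch"
  else
    # Interactive run: scratch lives under the output folder.
    echo "$1/_scratch"
  fi
}

# To override the default, pass -s explicitly (the scratch path here is made up):
#   jigsaw/assemble_flowcell -r <run folder> -o <output folder> -s /path/to/my_scratch
```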

The output folder must not be a subdirectory of the run folder: that makes it difficult to sync to local scratch space, and mixing inputs and outputs is a bad habit anyway.
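If you wrap the pipeline in your own scripts, a pre-flight check along these lines can catch a bad output location before a job is submitted. It is a minimal sketch: is_inside_run_folder is a hypothetical name, and it relies on realpath -m from GNU coreutils, which is assumed to be available on the cluster nodes.

```shell
# Hypothetical helper: succeeds (returns 0) if the output folder ($2) is the
# run folder ($1) itself or is nested anywhere inside it.
is_inside_run_folder() {
  local run out
  run=$(realpath -m -- "$1")   # -m: resolve even if the path does not exist yet
  out=$(realpath -m -- "$2")
  case "$out/" in
    "${run%/}"/*) return 0 ;;  # the trailing slash keeps /run1 from matching /run10
    *) return 1 ;;
  esac
}

# Example use before submitting:
#   if is_inside_run_folder "$run_folder" "$output_folder"; then
#     echo "output folder must not be inside the run folder" >&2
#   fi
```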

The scripts

assemble_flowcell

This script takes the primary parameters and submits an SGE job to assemble the flowcell. The output from the job is captured in the output directory you specify, and you will be emailed when the job completes.

scripts/assemble_flowcell

This is the script submitted to SGE. If you want to run the whole pipeline interactively, and you have a whole node, you can run this.

scripts/align_samples

Uses Isis/BWA to align the samples to the references provided in the sample sheet. Also creates the FASTQ files with adapters trimmed and reads reverse-complemented.

The results are placed in output/Alignment.

scripts/assemble_samples

Uses SPAdes 3.3.1 to assemble the FASTQ files. The results are placed in output/spades/SampleID.

scripts/generate_all_metrics

Uses various tools to create metrics. The following folders are placed in output:
  • picard/SampleID
  • quast/SampleID
  • visualization/SampleID
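To confirm that a finished run produced the folders described in this README, something like the following can help. missing_outputs is a hypothetical helper. The folder names come from the sections above, and it assumes SampleID is used literally as the directory name.

```shell
# Hypothetical helper: print every expected result folder that is missing.
# $1 is the output folder, $2 is a sample ID from the sample sheet.
missing_outputs() {
  for d in Alignment "spades/$2" "picard/$2" "quast/$2" "visualization/$2"; do
    [ -d "$1/$d" ] || printf '%s\n' "$1/$d"
  done
}

# Example: missing_outputs /path/to/output MySample
# Prints nothing when all five folders exist.
```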

Visualizations

Files called EC.report.html, BC.report.html, and RS.report.html will appear in the directory above the output directory. These display per-organism metrics and link out to other reports.

The visualization directory will contain a version of the assembly scaffolds aligned back to the reference genome. It will also contain a BED file called gaps.bed that can be used in IGV to quickly locate regions of the reference not covered by any contigs in the assembly.
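For a quick count of the gaps without opening IGV, the BED file can be summarized with awk. This sketch assumes gaps.bed follows the standard BED layout (chromosome, 0-based start, end, separated by whitespace), which this README does not confirm. summarize_gaps is a hypothetical helper name.

```shell
# Hypothetical helper: count the gap regions in a BED file and total their length.
# Assumes columns 2 and 3 are the standard BED start and end coordinates.
summarize_gaps() {
  awk '
    /^(track|browser|#)/ { next }            # skip optional BED header lines
    NF >= 3 { n++; total += $3 - $2 }        # BED intervals are half-open
    END { printf "%d gaps, %d bp uncovered\n", n, total }
  ' "$1"
}

# Example: summarize_gaps /path/to/output/visualization/MySample/gaps.bed
```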

Genomes

The reference genomes we use are under /illumina/scratch/Jigsaw/genomes. Only B. cereus is the same as what is in iGenomes. If you use IGV to inspect the alignments and assemblies, you must use the reference genomes from this location.
