GATK Best Practices variant calling pipeline, implemented with Sentieon using a license server
.
├── Jobs
│ ├── run_Jobs.pl
│ ├── variant_calling.sh
│ ├── check_Jobs.sh
│ ├── _rc_function_.sh
│ ├── test_list.txt
│ ├── Outputs
│ └── logs
├── Fastq
├── Reference
│ └── hg19_download.sh
├── bins
│ └── sentieon-genomics-201808.tar.gz
└── README.md
git clone https://github.com/shanghungshih/sentieon.git
- decompress the Sentieon binary archive (in `bins`)
tar zxvf sentieon-genomics-201808.tar.gz
- download reference and annotation data (recommended: use `screen` rather than `bg` or `&`)
  - open a new screen: `screen -S download_hg19`, run `bash hg19_download.sh`, then press `ctrl+A+D` to detach
  - attach to an existing screen: get the screen id with `screen -ls`, attach with `screen -r xxx`, then press `ctrl+A+D` to detach again
  - close an existing screen: get the screen id with `screen -ls`, attach with `screen -r xxx`, then press `ctrl+D` to close it
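
As an alternative to attaching and detaching by hand, `screen` can create a session that starts out detached. A minimal sketch, assuming `screen` is installed and the command is run from the `Reference` folder; the `run_in_screen` helper and the `DRY_RUN` variable are hypothetical and not part of this repo:

```shell
# Hypothetical helper: start a command in a detached screen session.
# With DRY_RUN=1 it only prints the command it would run.
run_in_screen() {
    session="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "screen -dmS ${session} $*"
    else
        # -dmS: create the named session in detached mode
        screen -dmS "${session}" "$@"
    fi
}

# Preview the command without starting anything
DRY_RUN=1 run_in_screen download_hg19 bash hg19_download.sh
```

Dropping `DRY_RUN=1` starts the download in the background; `screen -r download_hg19` attaches to it later.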
- check `sentieon/Jobs/variant_calling.sh`; the following are the default settings:
  - SENTIEON_LICENSE: `export SENTIEON_LICENSE=xxx.xxx.xxx.xxx:xxx`
  - main directory: `main_dir="/home/shanghung/sentieon"`
  - fastq1 file name: `fastq_1="${fastq_folder}/${SampleName}_R1.fastq.gz"`
  - fastq2 file name: `fastq_2="${fastq_folder}/${SampleName}_R2.fastq.gz"`
  - number of threads: `nt=32`
  - update annotation files
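
Because both FASTQ paths are built from `fastq_folder` and `SampleName`, a quick pre-flight check can catch misnamed files before a job starts. A minimal sketch; `check_fastq_pair` is a hypothetical helper, not part of `variant_calling.sh`, and the folder layout in the example is an assumption:

```shell
# Hypothetical helper: verify that both paired-end files exist for a sample,
# using the default naming pattern from variant_calling.sh.
check_fastq_pair() {
    fastq_folder="$1"
    SampleName="$2"
    fastq_1="${fastq_folder}/${SampleName}_R1.fastq.gz"
    fastq_2="${fastq_folder}/${SampleName}_R2.fastq.gz"
    missing=0
    for f in "${fastq_1}" "${fastq_2}"; do
        if [ ! -f "${f}" ]; then
            echo "missing: ${f}" >&2
            missing=1
        fi
    done
    # Return 0 only when both files were found
    return "${missing}"
}

# Example (paths are assumptions):
# check_fastq_pair /home/shanghung/sentieon/Fastq/sample01 sample01 && echo "ok"
```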
- check `sentieon/Jobs/run_Jobs.pl` to change the maximum number of parallel jobs (depends on the total number of threads and the `$nt` you set):
  - line 36: `$running_job < 2`
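
The limit follows from simple arithmetic: each job uses `nt` threads, so roughly the total thread count divided by `nt` jobs fit on the machine at once. A minimal sketch; `max_parallel_jobs` is a hypothetical helper and the 64-thread machine is an assumed example:

```shell
# Hypothetical helper: how many jobs of nt threads fit into total_threads.
max_parallel_jobs() {
    total_threads="$1"
    nt="$2"
    n_jobs=$(( total_threads / nt ))   # integer division rounds down
    if [ "${n_jobs}" -lt 1 ]; then
        n_jobs=1                       # always allow at least one job
    fi
    echo "${n_jobs}"
}

# Assumed example: 64 threads in total, nt=32 per job
max_parallel_jobs 64 32   # prints 2
```

The result is presumably the value to compare `$running_job` against on line 36.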
- put the raw data of each sample in the corresponding folder under `sentieon/Fastq`
- finally, run a single sample before submitting multiple samples, so that the reference index is built first
- run jobs (running via `screen` is suggested)
  - `-i`: ID file: the file containing the sample IDs
  - `-s`: start: the line (sample) to start from
  - `-e`: end: the line (sample) to stop at
perl run_Jobs.pl -i test_list.txt -s 1 -e 1
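
To preview which samples a given `-s`/`-e` range covers, the same lines can be printed from the ID file. A minimal sketch, assuming one sample ID per line and 1-based, inclusive line numbers (not confirmed against `run_Jobs.pl`); `select_samples` is a hypothetical helper:

```shell
# Hypothetical helper: print lines start..end (inclusive) of the ID file.
select_samples() {
    id_file="$1"
    start="$2"
    end="$3"
    sed -n "${start},${end}p" "${id_file}"
}

# Example: show the sample that "-s 1 -e 1" would select
# select_samples test_list.txt 1 1
```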
- check jobs (a sample is reported as successful if its checkpoint file is found in `Outputs`)
  - list_ID: read from the command line
  - checkpoint: `*.vqsr_SNP_INDEL.VQSR.pdf` (default)
bash check_Jobs.sh test_list.txt
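
The check can be pictured as a loop over the sample IDs that looks for each checkpoint file. A minimal sketch, not the actual `check_Jobs.sh`; it assumes the checkpoint file name starts with the sample ID and sits somewhere under `Outputs`, and `report_samples` and its output wording are hypothetical:

```shell
# Hypothetical helper: report, per sample ID, whether a checkpoint file exists.
report_samples() {
    list_id="$1"
    outputs_dir="$2"
    while IFS= read -r id || [ -n "${id}" ]; do
        if [ -z "${id}" ]; then
            continue   # skip blank lines
        fi
        # Assumption: the "*" in the checkpoint pattern begins with the sample ID
        hit="$(find "${outputs_dir}" -type f -name "${id}*.vqsr_SNP_INDEL.VQSR.pdf" | head -n 1)"
        if [ -n "${hit}" ]; then
            echo "${id} success"
        else
            echo "${id} not finished"
        fi
    done < "${list_id}"
}

# Example: report_samples test_list.txt Outputs
```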