These scripts are written to be used with version 0.9 or greater of the BaseSpace command-line interface (CLI).
Usage:
bash download_basespace_project.sh [options] -i|--id [Project ID]
Options:
-i, --id BaseSpace Project ID (integer)
-o, --output-path Path where the FASTQ files will be written
(Default: current working directory)
-c, --config BaseSpace CLI configuration
(Default: 'default')
The Project files are retrieved from BaseSpace and placed in the specified output path.
The output path and all of its contents are then set to read-only permissions to prevent accidental deletion.
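The read-only step described above can be sketched as follows. This is only an illustration of one way to do it; the helper name make_read_only is hypothetical, and the actual script may set permissions differently.

```bash
# Sketch only: remove write permission from a path and everything beneath it.
# `make_read_only` is a hypothetical helper name, not taken from the scripts.
make_read_only() {
    local target="$1"
    # a-w clears the write bit for user, group, and other; -R recurses.
    chmod -R a-w -- "$target"
}
```

To modify or delete the files later, write permission has to be restored first, e.g. with chmod -R u+w on the output path.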
Usage:
bash download_basespace_run.sh [options] -i|--id [Run ID]
Options:
-i, --id BaseSpace Run ID (integer)
-o, --output-path Path where the Run files will be written
(Default: current working directory)
-c, --config BaseSpace CLI configuration
(Default: 'default')
-e, --exclude Comma-separated list of extensions to exclude
(Default: 'jpg')
The Run files (except for any matching the specified set of extensions to exclude) are retrieved from BaseSpace and placed in the specified output path.
The output path and all of its contents are then set to read-only permissions to prevent accidental deletion.
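As a rough illustration of how a comma-separated value for -e/--exclude can be interpreted, the sketch below splits the list and tests a file name against it. The helper name is_excluded is hypothetical, and how the script actually applies the exclusions during download is not specified here.

```bash
# Sketch only: decide whether a file name ends in one of the extensions given
# as a comma-separated list (the format accepted by -e/--exclude).
# `is_excluded` is a hypothetical helper, not part of the scripts.
is_excluded() {
    local file="$1" exclude_list="$2" ext
    local -a exts
    # Split the list on commas into an array of extensions.
    IFS=',' read -r -a exts <<< "$exclude_list"
    for ext in "${exts[@]}"; do
        # Match a literal ".<ext>" suffix.
        [[ "$file" == *."$ext" ]] && return 0
    done
    return 1
}
```

For example, is_excluded s_1_1101.jpg jpg,locs succeeds, while is_excluded RunInfo.xml jpg,locs fails.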
Usage:
bash download_basespace_run_metadata.sh [options] -i|--id [Run ID]
Options:
-i, --id BaseSpace Run ID (integer)
-o, --output-path Path where a tar file will be written
(Default: current working directory)
-c, --config BaseSpace CLI configuration
(Default: 'default')
This script produces the tarball:
[tarfile path]/[run name].tar.gz
where run name is of the form:
[yymmdd]_[instrument ID]_[run index]_[A or B][flowcell ID]
This tarball contains all Run files except for those with the extensions:
bci bgzf filter jpg locs
The tarball is set to read-only permissions to prevent accidental deletion.
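A minimal sketch of producing such a tarball from an already-downloaded Run directory is shown below. It applies the exclusions at archive time, which is an assumption made for illustration; the script itself may instead exclude these extensions during download. The function name make_run_tarball and its arguments are hypothetical.

```bash
# Sketch only: build [tarfile path]/[run name].tar.gz from a Run directory,
# skipping the listed extensions, then make the tarball read-only.
# `make_run_tarball` and its arguments are hypothetical.
make_run_tarball() {
    local run_dir="$1" tarfile_path="$2"
    local run_name ext
    local -a excludes=()
    # The Run directory is assumed to be named after the run.
    run_name=$(basename "$run_dir")
    for ext in bci bgzf filter jpg locs; do
        excludes+=(--exclude="*.${ext}")
    done
    # -C keeps paths inside the archive relative to the Run directory's parent.
    tar -czf "${tarfile_path}/${run_name}.tar.gz" "${excludes[@]}" \
        -C "$(dirname "$run_dir")" "$run_name" || return 1
    chmod a-w "${tarfile_path}/${run_name}.tar.gz"
}
```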
Note: this script assumes that it is in the same path as download_basespace_run.sh.
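One common way for a bash script to find a sibling script in its own directory is sketched below. Whether the scripts here use this exact approach is an assumption; sibling_script is a hypothetical helper name.

```bash
# Sketch only: locate a sibling script relative to this file rather than the
# caller's working directory. `sibling_script` is a hypothetical helper name.
sibling_script() {
    local script_dir
    # BASH_SOURCE[0] is the path of the file that defines this function.
    script_dir=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd) || return 1
    echo "${script_dir}/$1"
}
```

A caller could then run, for instance, bash "$(sibling_script download_basespace_run.sh)" -i [Run ID] from any working directory.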
Usage:
bash download_from_basespace.sh [options] -r|--run-ids [RunID1,RunID2,...] -p|--project-ids [ProjectID1,ProjectID2,...]
Options:
-r, --run-ids Comma-separated list of BaseSpace Run IDs (integers)
-p, --project-ids Comma-separated list of BaseSpace Project IDs (integers)
-t, --tarfile-path Path where a tarball of Run files will be written
(Default: current working directory)
-f, --fastq-path Path where directories of FASTQ files will be written
(Default: current working directory)
-c, --config BaseSpace CLI configuration
(Default: 'default')
This script performs the following tasks:
- If Run IDs are specified, the script download_basespace_run_metadata.sh is used to retrieve the metadata for each Run, producing the tarballs:
  [tarfile path]/[run name].tar.gz
  where run name is of the form:
  [yymmdd]_[instrument ID]_[run index]_[A or B][flowcell ID]
- If Project IDs are specified:
  - The script download_basespace_project.sh is used to retrieve the .fastq.gz files for each Project.
  - The integrity of each .fastq.gz file is tested using gzip -t to ensure that it was correctly downloaded.
  - The flowcell ID of each .fastq.gz file is obtained, and the following directory is created for each flowcell ID:
    - If Run IDs are specified, and one of the Run names ends with the flowcell ID:
      [FASTQ path]/[Run name]/
    - If Run IDs are not specified, or none of the Run names ends with the flowcell ID:
      [FASTQ path]/[flowcell ID]/
  - The .fastq.gz files for each sample are concatenated across all lanes, e.g., the files:
    SampleX_S1_L00*_R1_001.fastq.gz
    are concatenated (in order, by lane) into:
    [Flowcell-specific output path]/SampleX_S1_R1_001.fastq.gz
- All output files (tarball, flowcell-specific directories, and all FASTQ files) are set to read-only permissions to prevent accidental deletion.
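The integrity check and per-lane concatenation in the task list above can be sketched as follows. This is an illustrative sketch rather than the script's actual code; merge_lanes and its arguments are hypothetical names.

```bash
# Sketch only: verify and merge the per-lane FASTQ files for one sample/read.
# `merge_lanes` and its arguments are hypothetical names.
merge_lanes() {
    local in_dir="$1" prefix="$2" read_tag="$3" out_dir="$4"
    local f
    local -a lanes=()
    # Glob expansion is sorted, so L001, L002, ... come out in lane order.
    for f in "$in_dir/${prefix}"_L00*_"${read_tag}"_001.fastq.gz; do
        [ -e "$f" ] || continue
        # gzip -t exits non-zero if the compressed data is corrupt.
        gzip -t "$f" 2>/dev/null || { echo "Corrupt file: $f" >&2; return 1; }
        lanes+=("$f")
    done
    [ "${#lanes[@]}" -gt 0 ] || { echo "No lane files found" >&2; return 1; }
    # Concatenated gzip members form a valid gzip stream, so the files can be
    # joined with cat without decompressing them.
    cat "${lanes[@]}" > "$out_dir/${prefix}_${read_tag}_001.fastq.gz"
}
```

For example, merge_lanes downloads SampleX_S1 R1 merged would combine downloads/SampleX_S1_L00*_R1_001.fastq.gz into merged/SampleX_S1_R1_001.fastq.gz (the directory names here are made up).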
Note: this script assumes that it is in the same path as download_basespace_run_metadata.sh and download_basespace_project.sh.
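The directory-naming rule from the task list (use the Run name when one ends with the flowcell ID, otherwise the flowcell ID itself) can be expressed roughly as below. The function name flowcell_dir is hypothetical, and the IDs in the usage example are made up.

```bash
# Sketch only: choose the output directory for a flowcell ID.
# `flowcell_dir` and its arguments are hypothetical.
# Usage: flowcell_dir FASTQ_PATH FLOWCELL_ID [RUN_NAME ...]
flowcell_dir() {
    local fastq_path="$1" flowcell_id="$2"
    shift 2
    local run_name
    for run_name in "$@"; do
        # A Run name of the form ..._[A or B][flowcell ID] ends with the ID.
        if [[ "$run_name" == *"$flowcell_id" ]]; then
            echo "${fastq_path}/${run_name}"
            return 0
        fi
    done
    # No Run names given, or none matched: fall back to the flowcell ID.
    echo "${fastq_path}/${flowcell_id}"
}
```

For example, flowcell_dir /data/fastq HABCDEFXX 200101_M00001_0001_AHABCDEFXX prints /data/fastq/200101_M00001_0001_AHABCDEFXX, while flowcell_dir /data/fastq HABCDEFXX prints /data/fastq/HABCDEFXX.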
Usage:
bash upload_basespace_bcl2fastq.sh [options]
Options:
-b, --bcl2fastq-path Path containing bcl2fastq output
(Default: current working directory)
-c, --config BaseSpace CLI configuration
(Default: 'default')
A BaseSpace Project is created for each project-specific directory in the bcl2fastq output, and all of the FASTQ files in those directories are uploaded to the corresponding Projects.