The Docker image used in this release is seglh/multiqc_v1.13:v1.4.0, which is stored in DNAnexus. This release is based on MultiQC v1.13 and incorporates a custom plugin that adds support for som.py, exome depth and the TSO500 metrics.tsv file.
This app runs MultiQC to generate a run-wide quality control (QC) report using the outputs from the MokaAMP, MokaPipe, TSO500 and MokaWES pipelines, including:
- Picard (CalculateHsMetrics, MarkDuplicates, CollectMultipleMetrics and TargetedPcrMetrics)
- FastQC
- bcl2fastq2
- bclconvert
- Peddy
- verifyBAMID
- Sentieon (duplication_metrics)
- som.py ^
- TSO500 metrics.tsv ^
- exome depth ^
- Sambamba_chanjo ^
^ Provided by the SEGLH plugin.
To generate QC reports, this app should be run at the end of an NGS pipeline, when all QC software outputs are available.
- project_for_multiQC - The name of the project to be assessed.
- This project must have a 'QC' folder in its root directory.
- coverage_level - The hsmetrics column to display, reporting the percentage of target bases covered at the required depth (e.g. PCT_TARGET_BASES_20X).
- (optional) Additional paths to locations of data to be included in the MultiQC report, e.g. for bclconvert data.
Additional QC files are downloaded from other locations:
- 'Stats.json' is downloaded from /runfolder/Data/Intensities/BaseCalls/Stats if demultiplexing was performed by bcl2fastq v2.20 (or later).
- Any files with *metrics* in the name are downloaded. Sentieon apps write all files to the output folder, and naming is inconsistent between apps (e.g. Duplication_metrics and duplication_metrics).
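
As a rough illustration of why a broad *metrics* pattern copes with the inconsistent Sentieon naming, the sketch below applies the same glob to a few file names. The function name `is_metrics_file` and the extra file names are assumptions made for this example; this is not the app's own download code.

```python
from fnmatch import fnmatchcase


def is_metrics_file(filename: str) -> bool:
    """Return True if the file name contains 'metrics' (case-sensitive glob)."""
    return fnmatchcase(filename, "*metrics*")


# Duplication_metrics and duplication_metrics come from the text above;
# the other names are hypothetical examples.
names = ["Duplication_metrics", "duplication_metrics", "sample1.bam", "Stats.json"]
matched = [name for name in names if is_metrics_file(name)]
print(matched)
```

Both spellings match because the pattern only requires the lowercase substring "metrics", so the differing capitalisation at the start of the name does not matter.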
The following outputs are placed in the DNAnexus project under '/QC/multiqc':
- An HTML QC report (named after the runfolder), which should be uploaded to the Viapath Genome Informatics server.
- A folder containing the output in text format.
- The app downloads all files (recursively) within the QC/ directory of the project.
- The dx_find_and_download function is used to search for specific files, which are downloaded only if found.
- The app sets the minimum-fold coverage reported in the general stats table by editing 'dnanexus_multiqc_config.yaml'. This value comes from the app's 'coverage_level' input parameter.
- A dockerised version of MultiQC is used.
- MultiQC parses all downloaded files and includes any it recognises in the report.
- The MultiQC outputs are uploaded to DNAnexus.
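
The config-editing step above could look something like the following minimal sketch. It assumes that coverage_level is given as the full column name (as in the PCT_TARGET_BASES_20X example) and that the config names the column explicitly. The YAML fragment and the function `set_coverage_column` are hypothetical and do not reproduce the real 'dnanexus_multiqc_config.yaml' or the app's own code.

```python
import re

# Matches hsmetrics coverage columns such as PCT_TARGET_BASES_20X.
COLUMN_PATTERN = re.compile(r"PCT_TARGET_BASES_\d+X")


def set_coverage_column(config_text: str, coverage_level: str) -> str:
    """Swap any coverage column named in the config text for coverage_level."""
    if not COLUMN_PATTERN.fullmatch(coverage_level):
        raise ValueError(f"unexpected coverage_level: {coverage_level!r}")
    return COLUMN_PATTERN.sub(coverage_level, config_text)


# Hypothetical fragment for illustration only, not the real config file.
example_config = """\
table_columns_visible:
  Picard:
    PCT_TARGET_BASES_30X: True
"""

updated = set_coverage_column(example_config, "PCT_TARGET_BASES_20X")
print(updated)
```

Validating the value before substituting it means a mistyped input fails early instead of silently producing a report with the column missing.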
- The project that MultiQC is run on must be shared with the user mokaguys.
- Only one value can be given to the coverage_level parameter, which may not be ideal for runs with mixed samples. Multiple reports may be required in these cases.
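
To judge how many reports a mixed run might need, one option is to group samples by the coverage column they require, as in the sketch below. The sample names, the mapping and the function `group_by_coverage` are hypothetical; how samples are then split between app runs is outside this sketch.

```python
from collections import defaultdict


def group_by_coverage(sample_levels: dict) -> dict:
    """Group sample names by the coverage column each one requires."""
    groups = defaultdict(list)
    for sample, level in sample_levels.items():
        groups[level].append(sample)
    return dict(groups)


# Hypothetical samples and their required coverage columns.
samples = {
    "sampleA": "PCT_TARGET_BASES_20X",
    "sampleB": "PCT_TARGET_BASES_100X",
    "sampleC": "PCT_TARGET_BASES_20X",
}

reports = group_by_coverage(samples)
print(len(reports), "report(s) would be needed")
```

Each key in the result corresponds to one coverage_level value, and therefore to one separate report.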