A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792.bowtie2.log
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792_sorted_dedup.txt
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792.flagstat
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792.stats
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792.insert_size_metrics.txt
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792.peak_stats.tsv
- /home/fchen3357/nf_work/50/1f8e8df4ca868840a97cab623ba241/SRR27410792.frip.tsv
General Statistics
| Sample Name | Insert Size | Mean Insert Size | Duplication | Error rate | Non-primary | Reads mapped | % Mapped | % Proper pairs | % MapQ 0 reads | Total seqs | Mean insert | Reads | Reads mapped | % Reads mapped | % Aligned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SRR27410792 | | | | 0.13% | 0.0M | 0.6M | 100.0% | 100.0% | 0.0% | 0.6M | 144.5bp | 0.6M | 0.6M | 100.0% | 91.7% |
| SRR27410792_final_sorted | 112bp | 148bp | | | | | | | | | | | | | |
| sorted | | | 12.5% | | | | | | | | | | | | |
Picard
Tools for manipulating high-throughput sequencing data. http://broadinstitute.github.io/picard
Insert Size
Plot shows the number of reads at a given insert size. Reads with different orientations are summed.
Mark Duplicates
Number of reads, categorised by duplication state. Pair counts are doubled - see help text for details.
The table in the Picard metrics file contains some columns referring to read pairs and some referring to single reads.
To make the numbers in this plot sum correctly, values referring to pairs are doubled according to the scheme below:
- READS_IN_DUPLICATE_PAIRS = 2 * READ_PAIR_DUPLICATES
- READS_IN_UNIQUE_PAIRS = 2 * (READ_PAIRS_EXAMINED - READ_PAIR_DUPLICATES)
- READS_IN_UNIQUE_UNPAIRED = UNPAIRED_READS_EXAMINED - UNPAIRED_READ_DUPLICATES
- READS_IN_DUPLICATE_PAIRS_OPTICAL = 2 * READ_PAIR_OPTICAL_DUPLICATES
- READS_IN_DUPLICATE_PAIRS_NONOPTICAL = READS_IN_DUPLICATE_PAIRS - READS_IN_DUPLICATE_PAIRS_OPTICAL
- READS_IN_DUPLICATE_UNPAIRED = UNPAIRED_READ_DUPLICATES
- READS_UNMAPPED = UNMAPPED_READS
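The doubling scheme can be checked with a short Python sketch. The function name and the input values are hypothetical and chosen only for illustration; the metric names are the ones listed in the scheme. This is not necessarily how the report itself is implemented. It only shows that the six plotted categories add up to the total number of reads examined.

```python
def reads_by_duplication_state(m):
    """Convert Picard MarkDuplicates metrics (dict of ints) to read-level counts."""
    # Pair-based metrics are doubled so that every category counts single reads.
    dup_pairs = 2 * m["READ_PAIR_DUPLICATES"]
    dup_pairs_optical = 2 * m["READ_PAIR_OPTICAL_DUPLICATES"]
    return {
        "READS_IN_UNIQUE_PAIRS": 2 * (m["READ_PAIRS_EXAMINED"] - m["READ_PAIR_DUPLICATES"]),
        "READS_IN_UNIQUE_UNPAIRED": m["UNPAIRED_READS_EXAMINED"] - m["UNPAIRED_READ_DUPLICATES"],
        "READS_IN_DUPLICATE_PAIRS_OPTICAL": dup_pairs_optical,
        "READS_IN_DUPLICATE_PAIRS_NONOPTICAL": dup_pairs - dup_pairs_optical,
        "READS_IN_DUPLICATE_UNPAIRED": m["UNPAIRED_READ_DUPLICATES"],
        "READS_UNMAPPED": m["UNMAPPED_READS"],
    }


# Hypothetical metrics, not taken from this report.
example = {
    "READ_PAIRS_EXAMINED": 1000,
    "READ_PAIR_DUPLICATES": 120,
    "READ_PAIR_OPTICAL_DUPLICATES": 20,
    "UNPAIRED_READS_EXAMINED": 50,
    "UNPAIRED_READ_DUPLICATES": 5,
    "UNMAPPED_READS": 30,
}
counts = reads_by_duplication_state(example)
total_reads = sum(counts.values())
```

With these inputs the categories sum to 2 * READ_PAIRS_EXAMINED + UNPAIRED_READS_EXAMINED + UNMAPPED_READS, which is why the pair-based values have to be doubled before plotting.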
Samtools
1.23
Toolkit for interacting with BAM/CRAM files. http://www.htslib.org DOI: 10.1093/bioinformatics/btp352
Percent mapped
Alignment metrics from samtools stats; mapped vs. unmapped reads vs. reads mapped with MQ0.
For a set of samples that have come from the same multiplexed library, similar numbers of reads are expected for each sample. Large differences in read numbers might indicate issues during library preparation. Whilst such differences can be controlled for in downstream processing (e.g. read count normalisation), you may wish to consider whether the read depths achieved have fallen below the levels recommended for your application.
Low alignment rates could indicate contamination of samples (e.g. adapter sequences), low sequencing quality or other artefacts. These can be further investigated in the sequence level QC (e.g. from FastQC).
Reads mapped with MQ0 often indicate that the reads are ambiguously mapped to multiple locations in the reference sequence. This can be due to repetitive regions in the genome, the presence of alternative contigs in the reference, or due to reads that are too short to be uniquely mapped. These reads are often filtered out in downstream analyses.
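As a rough illustration of the three categories described above, the following Python sketch reads the `SN` (summary numbers) lines of a `samtools stats` file and splits the reads into mapped with MQ >= 1, mapped with MQ0, and unmapped. The function names and the example values are hypothetical, and the sketch assumes the total is the sum of mapped and unmapped reads; it is not the report's own implementation.

```python
def parse_summary_numbers(text):
    """Collect 'SN' lines of samtools stats output into a dict of floats."""
    sn = {}
    for line in text.splitlines():
        fields = line.split("\t")
        # SN lines look like: SN<TAB>reads mapped:<TAB>950[<TAB># comment]
        if len(fields) >= 3 and fields[0] == "SN":
            sn[fields[1].rstrip(":")] = float(fields[2])
    return sn


def mapping_breakdown(sn):
    """Split reads into mapped (MQ>=1), mapped with MQ0, and unmapped."""
    mapped = sn["reads mapped"]
    mq0 = sn["reads MQ0"]
    unmapped = sn["reads unmapped"]
    total = mapped + unmapped  # assumption: every read is mapped or unmapped
    return {
        "mapped_mq1_plus": mapped - mq0,
        "mapped_mq0": mq0,
        "unmapped": unmapped,
        "percent_mapped": 100.0 * mapped / total if total else 0.0,
    }


# Hypothetical excerpt, not taken from this report.
example_stats = (
    "SN\traw total sequences:\t1000\n"
    "SN\treads mapped:\t950\n"
    "SN\treads unmapped:\t50\n"
    "SN\treads MQ0:\t50\t# mapped and MQ=0\n"
)
breakdown = mapping_breakdown(parse_summary_numbers(example_stats))
```

A large `mapped_mq0` share relative to `mapped_mq1_plus` would point to ambiguous placement in repetitive or duplicated reference regions rather than to a low overall alignment rate.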
Alignment stats
This module parses the output from samtools stats. All numbers in millions.
Flagstat
This module parses the output from samtools flagstat.
Flagstat: Percentage of total
This module parses the output from samtools flagstat, shown as a percentage of total reads.
Bowtie 2 / HISAT2
Results from both Bowtie 2 and HISAT2, tools for aligning reads against a reference genome. http://bowtie-bio.sourceforge.net/bowtie2; https://ccb.jhu.edu/software/hisat2 DOI: 10.1038/nmeth.1923; 10.1038/nmeth.3317; 10.1038/s41587-019-0201-4
Paired-end alignments
This plot shows the number of reads aligning to the reference in different ways.
There are 6 possible types of alignment:
- PE mapped uniquely: Pair has only one occurrence in the reference genome.
- PE mapped discordantly uniquely: Pair has only one occurrence, but not as a proper pair.
- PE one mate mapped uniquely: One read of a pair has one occurrence.
- PE multimapped: Pair has multiple occurrences.
- PE one mate multimapped: One read of a pair has multiple occurrences.
- PE neither mate aligned: Pair has no occurrence.
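The six categories correspond roughly to lines in the paired-end summary that Bowtie 2 prints at the end of a run. The Python sketch below pulls those counts out with regular expressions. The mapping from log lines to category names is an assumption made for illustration, the log excerpt is an illustrative example rather than this sample's log, and the last three counts are left in units of mates (single reads) rather than pairs.

```python
import re

# Assumed mapping from plot category to Bowtie 2 summary line (illustrative).
PATTERNS = {
    "PE mapped uniquely": r"(\d+) \([\d.]+%\) aligned concordantly exactly 1 time",
    "PE mapped discordantly uniquely": r"(\d+) \([\d.]+%\) aligned discordantly 1 time",
    "PE one mate mapped uniquely": r"(\d+) \([\d.]+%\) aligned exactly 1 time",
    "PE multimapped": r"(\d+) \([\d.]+%\) aligned concordantly >1 times",
    "PE one mate multimapped": r"(\d+) \([\d.]+%\) aligned >1 times",
    "PE neither mate aligned": r"(\d+) \([\d.]+%\) aligned 0 times",
}


def parse_bowtie2_pe_summary(log_text):
    """Return one count per category; a category missing from the log maps to 0."""
    counts = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, log_text)
        counts[name] = int(match.group(1)) if match else 0
    return counts


# Illustrative paired-end summary in the layout Bowtie 2 writes to stderr.
example_log = """10000 reads; of these:
  10000 (100.00%) were paired; of these:
    650 (6.50%) aligned concordantly 0 times
    8823 (88.23%) aligned concordantly exactly 1 time
    527 (5.27%) aligned concordantly >1 times
    ----
    650 pairs aligned concordantly 0 times; of these:
      34 (5.23%) aligned discordantly 1 time
    ----
    616 pairs aligned 0 times concordantly or discordantly; of these:
      1232 mates make up the pairs; of these:
        660 (53.57%) aligned 0 times
        571 (46.35%) aligned exactly 1 time
        1 (0.08%) aligned >1 times
96.70% overall alignment rate
"""
pe_counts = parse_bowtie2_pe_summary(example_log)
```

Because the first three categories count pairs and the mate-level lines count single reads, the values need to be brought to a common unit before they are compared or plotted together.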
Software Versions
Software Versions lists versions of software tools extracted from file contents.
| Software | Version |
|---|---|
| Samtools | 1.23 |