QC Report


general
Report generated at: 2021-08-28 17:27:53
Title: D2030.7_RW12302_youngadult_1_2
Description: gevirl
Pipeline version: v1.3.6
Pipeline type: tf
Genome: WS245chr
Aligner: bwa
Sequencing endedness: {'rep1': {'paired_end': True}, 'rep2': {'paired_end': True}, 'ctl1': {'paired_end': True}}
Peak caller: spp

Alignment quality metrics


SAMstat (raw unfiltered BAM)

                                    rep1      rep2      ctl1
Total Reads                         18028236  16431096  17930600
Total Reads (QC-failed)             0         0         0
Duplicate Reads                     0         0         0
Duplicate Reads (QC-failed)         0         0         0
Mapped Reads                        17997123  16393286  17895186
Mapped Reads (QC-failed)            0         0         0
% Mapped Reads                      99.8      99.8      99.8
Paired Reads                        18028236  16431096  17930600
Paired Reads (QC-failed)            0         0         0
Read1                               9014118   8215548   8965300
Read1 (QC-failed)                   0         0         0
Read2                               9014118   8215548   8965300
Read2 (QC-failed)                   0         0         0
Properly Paired Reads               17921288  16338998  17818884
Properly Paired Reads (QC-failed)   0         0         0
% Properly Paired Reads             99.4      99.4      99.4
With itself                         17985042  16382934  17881358
With itself (QC-failed)             0         0         0
Singletons                          12081     10352     13828
Singletons (QC-failed)              0         0         0
% Singleton                         0.1       0.1       0.1
Diff. Chroms                        8284      5236      15144
Diff. Chroms (QC-failed)            0         0         0

Marking duplicates (filtered BAM)

                                rep1      rep2      ctl1
Unpaired Reads                  0         0         0
Paired Reads                    8092235   7453545   8187868
Unmapped Reads                  0         0         0
Unpaired Duplicate Reads        0         0         0
Paired Duplicate Reads          1408468   1104776   1115625
Paired Optical Duplicate Reads  141834    107158    128158
% Duplicate Reads               17.4052   14.8222   13.6253
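
The duplicate percentage is the ratio of paired duplicate reads to paired reads, and the pairs left after duplicate removal should match the Read1 count of the filtered/deduped BAM. A minimal Python sketch of that arithmetic, using the rep1 values from the table (the function name `duplication_summary` is illustrative and not part of the pipeline):

```python
def duplication_summary(paired, paired_duplicates):
    """Return (% duplicate reads, read pairs remaining after deduplication)."""
    pct_duplicates = 100.0 * paired_duplicates / paired
    remaining_pairs = paired - paired_duplicates
    return pct_duplicates, remaining_pairs


# rep1 values taken from the "Marking duplicates" table.
pct_rep1, remaining_rep1 = duplication_summary(8092235, 1408468)
print(round(pct_rep1, 4), remaining_rep1)
```

For rep1 the remaining pair count is 6683767, which equals the Read1 count reported for the filtered/deduped BAM.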

Filtered out (samtools view -F 1804):
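
The value 1804 passed to `samtools view -F` is a bitwise OR of SAM flag bits; any read with at least one of those bits set is excluded. A minimal Python sketch that decodes the value, with bit meanings taken from the SAM format specification (the helper names `decode_flag` and `is_filtered_out` are illustrative, not pipeline functions):

```python
# SAM FLAG bits as defined in the SAM format specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails platform/vendor quality checks",
    0x400: "read is PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(value):
    """List the names of all flag bits set in `value`."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def is_filtered_out(read_flag, exclude=1804):
    """Mimic `samtools view -F exclude`: drop reads with any excluded bit set."""
    return (read_flag & exclude) != 0


for name in decode_flag(1804):
    print(name)
```

Decoding 1804 gives five bits: read unmapped, mate unmapped, not primary alignment, failed quality checks, and PCR or optical duplicate.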


SAMstat (filtered/deduped BAM)

                                    rep1      rep2      ctl1
Total Reads                         13367534  12697538  14144486
Total Reads (QC-failed)             0         0         0
Duplicate Reads                     0         0         0
Duplicate Reads (QC-failed)         0         0         0
Mapped Reads                        13367534  12697538  14144486
Mapped Reads (QC-failed)            0         0         0
% Mapped Reads                      100.0     100.0     100.0
Paired Reads                        13367534  12697538  14144486
Paired Reads (QC-failed)            0         0         0
Read1                               6683767   6348769   7072243
Read1 (QC-failed)                   0         0         0
Read2                               6683767   6348769   7072243
Read2 (QC-failed)                   0         0         0
Properly Paired Reads               13367534  12697538  14144486
Properly Paired Reads (QC-failed)   0         0         0
% Properly Paired Reads             100.0     100.0     100.0
With itself                         13367534  12697538  14144486
With itself (QC-failed)             0         0         0
Singletons                          0         0         0
Singletons (QC-failed)              0         0         0
% Singleton                         0.0       0.0       0.0
Diff. Chroms                        0         0         0
Diff. Chroms (QC-failed)            0         0         0

Filtered and duplicates removed


Sequence quality metrics (filtered/deduped BAM)

[Plots: rep1, rep2]

Open chromatin assays are known to have significant GC bias. Please take this into consideration as necessary.


Library complexity quality metrics


Library complexity (filtered non-mito BAM)

                          rep1      rep2      ctl1
Total Fragments           8061068   7411592   8113591
Distinct Fragments        6662773   6318309   7014079
Positions with Two Read   1021586   852810    883332
NRF = Distinct/Total      0.826537  0.85249   0.864485
PBC1 = OneRead/Distinct   0.820509  0.847076  0.859421
PBC2 = OneRead/TwoRead    5.35135   6.27583   6.824216

Mitochondrial reads are filtered out by default.

The non-redundant fraction (NRF) is the fraction of non-redundant mapped reads in a dataset; it is the ratio between the number of positions in the genome that uniquely mapped reads map to and the total number of uniquely mappable reads. The NRF should be > 0.8.

The PBC1 is the ratio of genomic locations with EXACTLY one read pair over the genomic locations with AT LEAST one read pair. PBC1 is the primary measure, and the PBC1 should be close to 1. Provisionally 0-0.5 is severe bottlenecking, 0.5-0.8 is moderate bottlenecking, 0.8-0.9 is mild bottlenecking, and 0.9-1.0 is no bottlenecking.

The PBC2 is the ratio of genomic locations with EXACTLY one read pair over the genomic locations with EXACTLY two read pairs. The PBC2 should be significantly greater than 1.

See more details at the ENCODE portal standard for ChIP-Seq pipeline.
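
The three library complexity ratios can be sketched in Python as follows. The function names and the toy counts are hypothetical; the report does not list the OneRead position count, so only NRF is recomputed from the table. The treatment of the category boundaries (each lower bound inclusive) is an assumption, since the text gives ranges that share endpoints.

```python
def library_complexity(total, distinct, one_read, two_read):
    """Return (NRF, PBC1, PBC2) from fragment and position counts."""
    nrf = distinct / total        # NRF  = Distinct / Total
    pbc1 = one_read / distinct    # PBC1 = OneRead / Distinct
    pbc2 = one_read / two_read    # PBC2 = OneRead / TwoRead
    return nrf, pbc1, pbc2


def bottleneck_level(pbc1):
    """Map PBC1 to the provisional categories quoted in the report.

    Assumption: each range includes its lower bound.
    """
    if pbc1 < 0.5:
        return "severe"
    if pbc1 < 0.8:
        return "moderate"
    if pbc1 < 0.9:
        return "mild"
    return "none"


# Toy counts (hypothetical, not taken from this report).
toy_nrf, toy_pbc1, toy_pbc2 = library_complexity(1000, 850, 750, 80)

# NRF for rep1, recomputed from the table: Distinct / Total.
nrf_rep1 = 6662773 / 8061068

# PBC1 values reported for rep1, rep2 and ctl1.
levels = [bottleneck_level(v) for v in (0.820509, 0.847076, 0.859421)]
print(round(nrf_rep1, 6), levels)
```

Under these category bounds, all three reported PBC1 values fall in the 0.8-0.9 range, that is, mild bottlenecking.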


NRF (non redundant fraction)
PBC1 (PCR Bottleneck coefficient 1)
PBC2 (PCR Bottleneck coefficient 2)


Replication quality metrics


IDR (Irreproducible Discovery Rate) plots

[Plots: rep1_vs_rep2, rep1-pr1_vs_rep1-pr2, rep2-pr1_vs_rep2-pr2, pooled-pr1_vs_pooled-pr2]

Reproducibility QC and peak detection statistics

                        overlap             idr
Nt                      44845               2777
N1                      24839               2464
N2                      26352               1811
Np                      35947               3082
N optimal               44845               3082
N conservative          44845               2777
Optimal Set             rep1_vs_rep2        pooled-pr1_vs_pooled-pr2
Conservative Set        rep1_vs_rep2        rep1_vs_rep2
Rescue Ratio            1.2475310874342782  1.109830752610731
Self Consistency Ratio  1.0609122750513307  1.3605742683600222
Reproducibility Test    pass                pass
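
The ratios in this table follow from the peak counts: the rescue ratio compares Np with Nt, and the self-consistency ratio compares N1 with N2, each as larger over smaller. A minimal Python sketch, assuming the commonly used ENCODE rule that the test passes when both ratios are at most 2, is borderline when exactly one exceeds 2, and fails when both do. The threshold and the function name `reproducibility` are assumptions; the report itself states only the outcome.

```python
def reproducibility(n1, n2, nt, np_, threshold=2.0):
    """Summarise replicate reproducibility from peak counts.

    n1, n2: peaks from each replicate's self-pseudoreplicates
    nt:     peaks from true replicates (rep1 vs rep2)
    np_:    peaks from pooled pseudoreplicates
    """
    rescue_ratio = max(np_, nt) / min(np_, nt)
    self_consistency_ratio = max(n1, n2) / min(n1, n2)
    # Count how many ratios exceed the (assumed) threshold.
    n_exceeding = int(rescue_ratio > threshold) + int(self_consistency_ratio > threshold)
    verdict = ("pass", "borderline", "fail")[n_exceeding]
    return {
        "n_optimal": max(np_, nt),
        "n_conservative": nt,
        "rescue_ratio": rescue_ratio,
        "self_consistency_ratio": self_consistency_ratio,
        "test": verdict,
    }


# Counts from the "overlap" and "idr" columns of the table.
overlap = reproducibility(n1=24839, n2=26352, nt=44845, np_=35947)
idr = reproducibility(n1=2464, n2=1811, nt=2777, np_=3082)
print(overlap["test"], idr["test"])
```

With the counts above, both columns reproduce the reported ratios, the N optimal and N conservative values, and the "pass" outcome.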

Reproducibility QC


Number of raw peaks

                 rep1   rep2
Number of peaks  99571  66835

Top 300000 raw peaks from spp with FDR 0.01

Peak calling statistics


Peak region size

                        rep1                rep2                idr_opt             overlap_opt
Min size                82.0                70.0                84.0                84.0
25 percentile           330.0               280.0               336.0               336.0
50 percentile (median)  330.0               280.0               336.0               336.0
75 percentile           330.0               280.0               336.0               336.0
Max size                863.0               502.0               926.0               926.0
Mean                    329.20796215765637  279.44988404279195  304.0908500973394   333.7839223993756

[Plots: rep1, rep2, idr_opt, overlap_opt]

Enrichment / Signal-to-noise ratio


Strand cross-correlation measures (trimmed/filtered SE BAM)

                                                  rep1               rep2
Number of Subsampled Reads                        7552289            6954038
Estimated Fragment Length                         170                155
Cross-correlation at Estimated Fragment Length    0.681737224078232  0.675781422532557
Phantom Peak                                      50                 50
Cross-correlation at Phantom Peak                 0.6793421          0.6747082
Argmin of Cross-correlation                       1500               1500
Minimum of Cross-correlation                      0.6721298          0.668192
NSC (Normalized Strand Cross-correlation coeff.)  1.014294           1.011358
RSC (Relative Strand Cross-correlation coeff.)    1.332084           1.164704
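
NSC and RSC are derived from three cross-correlation values in the table: the value at the estimated fragment length, the value at the phantom peak, and the minimum. A minimal Python sketch using the standard definitions (the function name `nsc_rsc` is illustrative; small differences from the reported values are expected because the table lists rounded inputs):

```python
def nsc_rsc(cc_fragment, cc_phantom, cc_min):
    """Return (NSC, RSC) from strand cross-correlation values.

    NSC = CC(fragment length) / min(CC)
    RSC = (CC(fragment length) - min(CC)) / (CC(phantom peak) - min(CC))
    """
    nsc = cc_fragment / cc_min
    rsc = (cc_fragment - cc_min) / (cc_phantom - cc_min)
    return nsc, rsc


# Values from the rep1 and rep2 columns of the table.
nsc1, rsc1 = nsc_rsc(0.681737224078232, 0.6793421, 0.6721298)
nsc2, rsc2 = nsc_rsc(0.675781422532557, 0.6747082, 0.668192)
print(round(nsc1, 4), round(rsc1, 4), round(nsc2, 4), round(rsc2, 4))
```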


Performed on subsampled (15000000) reads mapped from FASTQs that are trimmed to 50 bp. FASTQ trimming and read subsampling are used for cross-correlation analysis only; untrimmed FASTQs are used for all other analyses.

NOTE1: For SE datasets, reads from replicates are randomly subsampled to 15000000.
NOTE2: For PE datasets, the first end (R1) of each read pair is selected and trimmed to 50 bp; the reads are then randomly subsampled to 15000000.


[Plots: rep1, rep2]

Jensen-Shannon distance (filtered/deduped BAM)

                        rep1                  rep2
AUC                     0.40221545081867416   0.40142276945747934
Synthetic AUC           0.49825521222589597   0.49820738625029015
X-intercept             0.01910276292495215   0.019400152087612046
Synthetic X-intercept   0.0                   0.0
Elbow Point             0.573054945142677     0.5566486436858692
Synthetic Elbow Point   0.502577825478586     0.502173165396399
JS Distance             0.07821226994280522   0.06790753646438286
Synthetic JS Distance   0.15133716309185755   0.15095320199368573
% Genome Enriched       37.71293762611596     39.64197538256421
Diff. Enrichment        10.30736191637953     10.237824992151095
CHANCE Divergence       0.08800174268661057   0.08713705527227694

Peak enrichment


Fraction of reads in peaks (FRiP)

FRiP for spp raw peaks

                          Fraction of Reads in Peaks
rep1                      0.4809953728189508
rep2                      0.3375965482442344
rep1-pr1                  0.3396605028780173
rep2-pr1                  0.3454366751354987
rep1-pr2                  0.3340613660023406
rep2-pr2                  0.3475973606217773
pooled                    0.4754610307617796
pooled-pr1                0.35543752107225773
pooled-pr2                0.34738416949458945

FRiP for overlap peaks

                          Fraction of Reads in Peaks
rep1_vs_rep2              0.253985218226138
rep1-pr1_vs_rep1-pr2      0.1688559011707021
rep2-pr1_vs_rep2-pr2      0.15714211684186336
pooled-pr1_vs_pooled-pr2  0.2152894877865674

FRiP for IDR peaks

                          Fraction of Reads in Peaks
rep1_vs_rep2              0.036203583093881345
rep1-pr1_vs_rep1-pr2      0.03632465045534951
rep2-pr1_vs_rep2-pr2      0.024755271454986
pooled-pr1_vs_pooled-pr2  0.03871084645382909
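
FRiP is the number of reads that overlap at least one peak divided by the total number of reads. A minimal Python sketch on toy intervals illustrates the calculation. The data, the function names, and the half-open (BED-style) coordinate convention are assumptions for illustration only; this is not how the pipeline computes the values above.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def frip(reads, peaks):
    """Fraction of reads overlapping at least one peak.

    reads, peaks: iterables of (chrom, start, end) tuples.
    """
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    reads = list(reads)
    if not reads:
        return 0.0

    # Simple quadratic scan; fine for a toy example, too slow for real data.
    in_peaks = sum(
        1
        for chrom, start, end in reads
        if any(overlaps(start, end, p_start, p_end)
               for p_start, p_end in peaks_by_chrom.get(chrom, []))
    )
    return in_peaks / len(reads)


# Toy data (hypothetical): two of the four reads overlap the single peak.
toy_reads = [("chrI", 100, 150), ("chrI", 400, 450), ("chrII", 10, 60), ("chrI", 290, 310)]
toy_peaks = [("chrI", 120, 300)]
toy_frip = frip(toy_reads, toy_peaks)
print(toy_frip)
```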

For spp raw peaks:


For overlap/IDR peaks: