6.7 years ago
Sharon
Hello everyone,
MuTect2 from GATK3 discarded 87.33% of the reads while I was preparing a panel of normals (PON).
I will use sp1.vcf in the PON creation.
sp1.vcf: sp1.bam
	java -jar ${GATK}/GenomeAnalysisTK.jar \
	    -T MuTect2 \
	    -R ${hg38}.fasta \
	    -I:tumor sp1.bam \
	    --dbsnp ${DBSNP} \
	    --cosmic ${COSMIC} \
	    --artifact_detection_mode \
	    -o sp1.vcf
Does this seem okay to you? Is there anything I can do about the non-primary reads (42.77%) and the duplicate reads (44.56%)?
INFO 00:57:17,188 MicroScheduler - 158885515 reads were filtered out during the traversal out of approximately 181931389 total reads (87.33%)
INFO 00:57:17,190 MicroScheduler - -> 9210 reads (0.01% of total) failing BadCigarFilter
INFO 00:57:17,192 MicroScheduler - -> 81066454 reads (44.56% of total) failing DuplicateReadFilter
INFO 00:57:17,193 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 00:57:17,195 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 00:57:17,197 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 00:57:17,198 MicroScheduler - -> 77809851 reads (42.77% of total) failing NotPrimaryAlignmentFilter
INFO 00:57:17,200 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
------------------------------------------------------------------------------------------
Done. ------------------------------------------------------------------------------------------
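For anyone who wants to sanity-check the numbers in this log, here is a minimal Python sketch. It assumes the MicroScheduler lines are pasted into a string exactly as shown above; the function name `summarize_filters` is made up for illustration and is not part of GATK.

```python
import re

# Filter summary copied from the GATK3 log above (zero-count filters omitted).
LOG = """\
INFO 00:57:17,188 MicroScheduler - 158885515 reads were filtered out during the traversal out of approximately 181931389 total reads (87.33%)
INFO 00:57:17,190 MicroScheduler - -> 9210 reads (0.01% of total) failing BadCigarFilter
INFO 00:57:17,192 MicroScheduler - -> 81066454 reads (44.56% of total) failing DuplicateReadFilter
INFO 00:57:17,198 MicroScheduler - -> 77809851 reads (42.77% of total) failing NotPrimaryAlignmentFilter
"""

def summarize_filters(log_text):
    """Return (filtered, total, {filter_name: count}) parsed from the log text."""
    header = re.search(
        r"(\d+) reads were filtered out .* approximately (\d+) total reads",
        log_text,
    )
    filtered, total = int(header.group(1)), int(header.group(2))
    per_filter = {
        name: int(count)
        for count, name in re.findall(
            r"-> (\d+) reads \([\d.]+% of total\) failing (\w+)", log_text
        )
    }
    return filtered, total, per_filter

filtered, total, per_filter = summarize_filters(LOG)

# Print the filters from largest to smallest, with the recomputed percentage.
for name, count in sorted(per_filter.items(), key=lambda kv: -kv[1]):
    print(f"{name:28s} {count:>12d} {100 * count / total:6.2f}%")

print(f"filtered / total = {100 * filtered / total:.2f}%")
print("per-filter counts add up to the filtered total:",
      sum(per_filter.values()) == filtered)
```

The per-filter counts add up exactly to the filtered total, which suggests that each read is counted once, under a single filter, so the duplicate and non-primary categories do not overlap in this report.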
This is how the STAR log summary looks:
UNIQUE READS:
Uniquely mapped reads number | 29463720
Uniquely mapped reads % | 72.35%
Average mapped length | 271.00
Number of splices: Total | 21158459
Number of splices: Annotated (sjdb) | 21067563
Number of splices: GT/AG | 20952149
Number of splices: GC/AG | 126813
Number of splices: AT/AC | 5622
Number of splices: Non-canonical | 73875
Mismatch rate per base, % | 0.48%
Deletion rate per base | 0.02%
Deletion average length | 1.27
Insertion rate per base | 0.01%
Insertion average length | 1.84
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 9507878
% of reads mapped to multiple loci | 23.35%
Number of reads mapped to too many loci | 355263
% of reads mapped to too many loci | 0.87%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 2.76%
% of reads unmapped: other | 0.67%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
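The STAR summary can be cross-checked in the same spirit. The sketch below is only illustrative: the values are copied by hand from the log above, and the variable names are my own. It confirms that the read categories account for 100% of the input and back-calculates the approximate number of input reads from two independent rows.

```python
# Values copied from the STAR log summary above.
unique_reads, unique_pct = 29463720, 72.35
multi_reads, multi_pct = 9507878, 23.35

# Percentage of input reads in each category reported by STAR.
percentages = {
    "uniquely mapped": 72.35,
    "mapped to multiple loci": 23.35,
    "mapped to too many loci": 0.87,
    "unmapped: too many mismatches": 0.00,
    "unmapped: too short": 2.76,
    "unmapped: other": 0.67,
}
total_pct = round(sum(percentages.values()), 2)

# Two independent estimates of the number of input reads;
# they differ slightly because the percentages are rounded.
est_input_from_unique = unique_reads / (unique_pct / 100)
est_input_from_multi = multi_reads / (multi_pct / 100)

print(f"categories sum to {total_pct:.2f}%")
print(f"input reads estimated from unique row: {est_input_from_unique:,.0f}")
print(f"input reads estimated from multi row:  {est_input_from_multi:,.0f}")
```

Both estimates come out at roughly 40.7 million input reads, far fewer than the ~181.9 million records GATK traversed. That gap is at least consistent with multi-mapping reads contributing several alignment records each, although the log alone does not prove it.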
Thanks dariober. I don't know if I should retain them or not!
If you don't know, then you should probably not retain them. There is a reason why they are not retained by default. You can consult the GATK Best Practices for more info: https://software.broadinstitute.org/gatk/best-practices/workflow?id=11146