How to get individual strand counts from Mutect2 for multi-allelic sites
5.2 years ago
nickeener ▴ 60

Hi all, I'm trying to pull strand count information from a VCF file made using GATK4's Mutect2. I used the following command to create this VCF:

gatk Mutect2 -I SRR8525881.bam -mbq 20 -R ../../genome/hxb2.fa --mitochondria-mode True -O SRR8525881.vcf

The VCF output for multi-allelic sites looks like this. The fields I'm interested in are the read depths (AD) and the strand counts (SB); the strand counts are in this order: ref forward, ref reverse, alt forward, alt reverse:

K03455.1 2042 . AG GA,GG . . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;MMQ=60,60,60;MPOS=56,45;OCM=0;POPAF=2.40,2.40;TLOD=35.34,20.29 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2:16,10,9:0.289,0.258:35:12,8,2:4,1,5:10,6,12,7

The strand counts given appear to be combined counts for both alternate alleles (12 + 7 = 10 + 9 = 19). Is there any way I can get strand counts for each alternate allele individually?
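To make the arithmetic concrete, here is a minimal Python sketch that parses the sample column from the record above and compares the SB totals with the AD values. The helper name `parse_sample` is just for illustration and is not part of GATK or any VCF library; the FORMAT and sample strings are copied from the record as posted.

```python
def parse_sample(format_str, sample_str):
    """Map FORMAT keys to the raw string values of one sample column."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))


# FORMAT and sample columns copied from the multi-allelic record at position 2042.
fmt = "GT:AD:AF:DP:F1R2:F2R1:SB"
sample = "0/1/2:16,10,9:0.289,0.258:35:12,8,2:4,1,5:10,6,12,7"

fields = parse_sample(fmt, sample)

# AD holds one depth per allele: ref first, then each alt in ALT order.
ad = [int(x) for x in fields["AD"].split(",")]
ref_depth, alt_depths = ad[0], ad[1:]

# SB holds four counts: ref forward, ref reverse, alt forward, alt reverse.
ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in fields["SB"].split(","))

sb_ref_total = ref_fwd + ref_rev
sb_alt_total = alt_fwd + alt_rev

print("ref depth from AD:", ref_depth, "| ref total from SB:", sb_ref_total)
print("summed alt depths from AD:", sum(alt_depths), "| alt total from SB:", sb_alt_total)
```

The alt total from SB matches the sum of both alt depths in AD, which is why the SB field alone cannot say how the 12 forward and 7 reverse reads divide between GA and GG.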

I've tried splitting these multi-allelic sites using GATK's VariantsToTable with the --splitMultiAllelic parameter and vcflib's vcfmulti program, but I get the following output for the same site:

K03455.1 2042 . AG GA 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;MMQ=60,60,60;MPOS=56;OCM=0;POPAF=2.40;TLOD=35.34 GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.289:35:12,8,2:4,1,5:10,6,12,7

K03455.1 2042 . AG GG 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;MMQ=60,60,60;MPOS=45;OCM=0;POPAF=2.40;TLOD=20.29 GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.258:35:12,8,2:4,1,5:10,6,12,7

As you can see, the alternate strand count info is the same combined total for both alleles.
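The same check can be done programmatically. This is a rough Python sketch, assuming whitespace-separated columns as in the lines pasted above and a single sample column; the helper `sample_field` is a made-up name for illustration, not a function from any library.

```python
def sample_field(vcf_line, key):
    """Return the raw value of one FORMAT key from a single-sample VCF line."""
    cols = vcf_line.split()
    format_keys = cols[8].split(":")   # column 9 is FORMAT
    sample_values = cols[9].split(":")  # column 10 is the only sample
    return sample_values[format_keys.index(key)]


# The two split records, copied from the output above.
split_records = [
    "K03455.1 2042 . AG GA 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;"
    "MMQ=60,60,60;MPOS=56;OCM=0;POPAF=2.40;TLOD=35.34 "
    "GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.289:35:12,8,2:4,1,5:10,6,12,7",
    "K03455.1 2042 . AG GG 0 . DP=35;ECNT=8;MBQ=20,20,20;MFRL=182,166,90;"
    "MMQ=60,60,60;MPOS=45;OCM=0;POPAF=2.40;TLOD=20.29 "
    "GT:AD:AF:DP:F1R2:F2R1:SB ./0/1:16,10,9:0.258:35:12,8,2:4,1,5:10,6,12,7",
]

# Collect SB, AD and AF per alt allele (column 5 is ALT).
by_alt = {
    line.split()[4]: {key: sample_field(line, key) for key in ("SB", "AD", "AF")}
    for line in split_records
}

for alt, values in by_alt.items():
    print(alt, values)

# True when splitting left the SB field untouched across both records.
sb_unchanged = by_alt["GA"]["SB"] == by_alt["GG"]["SB"]
print("SB identical after splitting:", sb_unchanged)
```

Only AF (along with the per-allele INFO values) differs between the two split records; SB and AD are carried over unchanged.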

GATK4 Mutect2 VCF • 1.4k views