How to identify biallelic mutations called using WES data?
9 months ago
tanbiswas6 ▴ 10

Hi

I have called somatic mutations using the GATK Mutect2 pipeline and annotated the PASS-filtered VCF file with the VEP annotator, producing a MAF file. I want to identify all the biallelic somatic mutations in the MAF file. Is it right to consider a mutation biallelic when its allele frequency is more than 0.5? However, the annotated MAF file doesn't have a tumor variant allele frequency (t_VAF) column. If I calculate the t_VAF using the formula below:

t_VAF = t_alt_count / (t_alt_count + t_ref_count)

It gives numbers I did not expect. For example, for one mutation in my MAF file, t_ref_count is 134 and t_alt_count is 3, and the t_VAF would then be 3.0.
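As a worked check of the formula above: with t_alt_count = 3 and t_ref_count = 134, the ratio is 3 / (3 + 134) = 3 / 137, which is about 0.0219 (roughly 2.2%), so a value of 3.0 suggests the result was not divided by the total depth. The sketch below appends a t_VAF column to a tab-delimited MAF. The function name `add_t_vaf` is hypothetical, and it assumes the file carries `t_alt_count` and `t_ref_count` columns as in the GDC MAF layout.

```shell
# Append a t_VAF column to a tab-delimited MAF read from stdin.
# Assumption: the header row contains t_alt_count and t_ref_count.
add_t_vaf() {
  awk -F'\t' -v OFS='\t' '
    /^#/ { print; next }                  # pass through comment lines (e.g. "#version ...")
    header_seen == 0 {                    # first non-comment line is the header
      for (i = 1; i <= NF; i++) {
        if ($i == "t_alt_count") alt = i
        if ($i == "t_ref_count") ref = i
      }
      if (!alt || !ref) { print "missing count columns" > "/dev/stderr"; exit 1 }
      header_seen = 1
      print $0, "t_VAF"
      next
    }
    {
      depth = $alt + $ref
      # Report NA when there is no coverage instead of dividing by zero
      vaf = (depth > 0) ? sprintf("%.4f", $alt / depth) : "NA"
      print $0, vaf
    }'
}

# Example with the counts quoted in the question (gene name is a placeholder)
printf 'Hugo_Symbol\tt_ref_count\tt_alt_count\nGENE1\t134\t3\n' | add_t_vaf
```

On a real file this would be run as `add_t_vaf < input.maf > input.with_vaf.maf`, where both file names are placeholders.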

So how do I know if a mutation is biallelic or monoallelic?

There is some other information in the MAF file, like:

Reference_Allele  Tumor_Seq_Allele1  Tumor_Seq_Allele2  dbSNP_RS      Match_Norm_Seq_Allele1  Match_Norm_Seq_Allele2
G                 G                  A                  rs1569861706  G                       G

What do Tumor_Seq_Allele1 and Tumor_Seq_Allele2 represent? I have checked the GDC MAF file documentation but did not understand it.

Please let me know how to call a mutation biallelic or monoallelic.

Thank you.

Regards,

Tanay

mutect2 gatk wes mutation • 880 views

Three comments

  1. "Is it right to consider a mutation biallelic when its allele frequency is more than 0.5?" Absolutely not. You are looking at many variants, and AF alone is not a good indicator in a multiple-testing setting. More importantly, AF in a tumor is heavily influenced by purity.
  2. "VAF = t_alt_count / (t_alt_count + t_ref_count)": please note that MuTect2 counts are haplotype counts, not really allele counts. I have even seen MuTect2 call variants when t_alt_count = 0.
  3. I am pretty sure all such MAF files only indicate that a mutation is present at a locus, not whether it is biallelic or monoallelic. Furthermore, in the traditional filtering strategy, a biallelic call is considered to have a high chance of being germline or an artifact, and will be filtered out. You have to build your own biallelic-specific analysis and validation.
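To illustrate the purity point in comment 1, here is a minimal sketch under a simplified model that is commonly used but does not come from this thread: expected VAF = p * m / (p * C_t + (1 - p) * C_n), where p is tumor purity, m is the number of mutated copies, C_t is the total tumor copy number at the locus, and C_n is the normal copy number (assumed to be 2). The function name `expected_vaf` is hypothetical, and the model assumes a clonal mutation and ignores sequencing noise.

```shell
# Expected VAF of a clonal somatic mutation under a simple purity/copy-number model.
# usage: expected_vaf PURITY MUTATED_COPIES TUMOR_TOTAL_COPIES [NORMAL_COPIES]
# Assumption: normal copy number defaults to 2 (autosomal diploid locus).
expected_vaf() {
  awk -v p="$1" -v m="$2" -v ct="$3" -v cn="${4:-2}" \
    'BEGIN { printf "%.3f\n", (p * m) / (p * ct + (1 - p) * cn) }'
}

expected_vaf 0.4 1 2   # one of two tumor copies mutated, 40% purity: 0.200
expected_vaf 0.4 2 2   # both tumor copies mutated, 40% purity: 0.400
```

Under these assumptions, a mutation on both copies in a 40% pure sample would still sit below 0.5, which is one reason a fixed 0.5 cutoff can be misleading.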
9 months ago
Michael 55k

You can use GATK SelectVariants for this:

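 # Keep SNPs and indels (adjust as you like) and drop filtered records
 # (when you are at it anyway...). --restrict-alleles-to BIALLELIC is the
 # main option for the task: it keeps sites with exactly one ALT allele.
 # Comments cannot follow a trailing backslash, so they are collected here.
 gatk SelectVariants \
   -R "$REFERENCE" \
   -V "$1-filtered.vcf" \
   -O "$1-filtered-selected.vcf" \
   --select-type-to-include SNP \
   --select-type-to-include INDEL \
   --exclude-filtered true \
   --remove-unused-alternates \
   --restrict-alleles-to BIALLELIC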

Then pass only the filtered variants to VEP. This will also be more efficient, because you are only annotating the variants you want to keep.

Otherwise, you need to turn the MAF file back into a VCF; see "How to convert maf to vcf while retaining original mutation notations?"

Check whether the tool has an option to output annotated variants as VCF, though.


Hi Michael,

Thanks for the suggestion. I have run gatk SelectVariants on one of my filtered, PASS-only VCF files. However, even after running it, I'm getting the same number of variants. If my previous VCF file had both mono- and biallelic variants, then after running SelectVariants I should get only the biallelic variants, which would be fewer.

Does that mean I have only biallelic variants in my VCF file, even without running SelectVariants? Or is there a mistake somewhere? Please let me know.

Thank you.


I think it would be quite reasonable to expect no more than 2 alleles present in somatic mutations in one sample. This may be especially true when the mutations were already filtered. Otherwise, these might be false positives.

See https://gatk.broadinstitute.org/hc/en-us/community/posts/360059271532-Mutect2-filter-somatic-mutations-as-multiallelic-and-clustered-events

If you want to double-check the results, you can use bcftools as an alternative; see "how to remove multiallelic from VCF". But I believe it is likely that your output VCF already had no multiallelic sites.
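A quick way to run that double-check is to count multiallelic records directly. In VCF terms, a multiallelic site has more than one ALT allele, written as a comma-separated list in the ALT column, so a count of zero would explain why SelectVariants returned the same number of variants. The function name `count_multiallelic` is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF.

```shell
# Count VCF records whose ALT column (field 5) lists more than one allele.
# Assumption: plain-text VCF on stdin; use "gzip -dc file.vcf.gz |" for compressed input.
count_multiallelic() {
  awk -F'\t' '!/^#/ && $5 ~ /,/ { n++ } END { print n + 0 }'
}

# Tiny demonstration: the second record has two ALT alleles (T and G)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tG\tA\nchr1\t200\t.\tC\tT,G\n' \
  | count_multiallelic

# With bcftools installed, an equivalent count would be (file name is a placeholder):
#   bcftools view -H -m3 sample-filtered.vcf | wc -l
```

Note that "biallelic" here is the VCF sense (REF plus one ALT at a site); it says nothing about whether both copies of a gene carry a mutation.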


Thank you Michael. I'll look into your suggestions.
