bcftools norm resulting in '*' in alternate allele
3.8 years ago
prasundutta87 ▴ 670

Hi,

After splitting multiallelic variants in my human multisample exonic germline VCF, the newly generated file contained many sites with '*' as the alternate allele. The command I used is:

bcftools norm --check-ref w -f GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna -m -any exonic_variants.vcf.gz >bcftools_norm.vcf

I am seeing this because '*' denotes a spanning deletion (https://gatk.broadinstitute.org/hc/en-us/articles/360035531912-Spanning-or-overlapping-deletions-allele-#article-comments), and the input VCF file (generated using GATK) has this record:

chr1    2503910 .       A       C,*

and it got split into:

chr1    2503910 .       A       C
chr1    2503910 .       A       *

My question is how do I treat this scenario? Should I just remove sites with a '*' in the alternate allele? What is the best practice here?

My usual approach is to concentrate only on high-quality biallelic variants (SNVs) without normalising, as multiallelic sites are generally considered to be sequencing errors (unless I want to study genetic mosaicism). Since that is not my aim in the current study, is it advisable to skip normalisation and move directly to variant filtration? As the current study also includes indels, I could restrict these to biallelic indels (-v indels -m2 -M2), which removes the sites with '*'.

PS: I am using the latest version of bcftools (v1.11).

SNP bcftools VCF exome • 1.8k views
3.8 years ago

The '*' allele is meaningless in a record where it is the only ALT allele, so you can remove it. No alternate allele/variant is lost, because '*' only marks the gap left by an upstream indel.
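
As a rough illustration of the removal described above, here is a minimal sketch that drops records whose ALT column is exactly '*'. It assumes the VCF has already been split with `bcftools norm -m -any`, is uncompressed, and is tab-separated. The function name `drop_star_alt` is a hypothetical helper, not part of bcftools.

```shell
#!/bin/sh
# Sketch only: remove records whose ALT field is the literal '*'.
# drop_star_alt is a hypothetical helper name, not a bcftools command.
drop_star_alt() {
    # Header lines (starting with '#') always pass through.
    # Data lines are kept unless column 5 (ALT) is exactly '*'.
    awk -F'\t' '/^#/ || $5 != "*"'
}

# The two split records from the question: only the A>C record is printed.
printf 'chr1\t2503910\t.\tA\tC\nchr1\t2503910\t.\tA\t*\n' | drop_star_alt
```

On the file from the question this could be run as `drop_star_alt < bcftools_norm.vcf > bcftools_norm.no_star.vcf` (the output file name is made up). It only matches records where '*' is the sole ALT allele, so an unsplit record such as `C,*` is left untouched.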

Thanks for this, Pierre!

I have a case in a phased VCF where the '*' allele is reported on the other haplotype and no upstream indel is reported:

#CHROM  POS         REF   ALT   GT
chr1    154590148   CG    C     0|1
chr1    154590149   G     *     1|0
chr1    154590149   G     C     0|1

See my question: Removing / Excluding / Collapsing Overlapping Indels
