VCF file lacks alternative allele listing for most rs ids
10.5 years ago
devenvyas ▴ 760

I have SNP data on 64 samples from my population of interest (~330,000 SNPs per sample using the HumanCNV370-Quad).

I sorted and filtered the published Altai Neanderthal and Denisovan VCF files (http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/VCF/ and http://cdna.eva.mpg.de/denisova/VCF/hg19_1000g/) down to only the rs IDs found in my SNP data.

I then noticed a problem: for well over half of the SNPs in the Neanderthal VCF, and for a small percentage in the Denisovan VCF, the alternate allele is not listed. When I look up those SNPs in dbSNP or in the Denisovan VCF file, alternate alleles exist and are listed. Luckily, it seems that whenever the alt allele is not listed, the genotype is always homozygous for the ref allele.
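To make the observation above checkable, here is a minimal sketch that scans VCF data lines, collects the records whose ALT column is `.`, and flags whether each one is homozygous reference. The function names, the toy records, and the rs IDs are made up for illustration; it assumes a single-sample, tab-delimited VCF with a `GT` key in the FORMAT column.

```python
def parse_record(line):
    """Split one tab-delimited VCF data line into the fields used here."""
    fields = line.rstrip("\n").split("\t")
    fmt_keys = fields[8].split(":")      # FORMAT column, e.g. GT:DP
    sample = fields[9].split(":")        # first (only) sample column
    return {
        "chrom": fields[0],
        "pos": int(fields[1]),
        "rsid": fields[2],
        "ref": fields[3],
        "alt": fields[4],
        "gt": sample[fmt_keys.index("GT")],
    }


def missing_alt_report(lines):
    """Return the records whose ALT is '.', with a hom_ref flag added."""
    missing = []
    for line in lines:
        if line.startswith("#"):         # skip meta and header lines
            continue
        rec = parse_record(line)
        if rec["alt"] == ".":
            alleles = rec["gt"].replace("|", "/").split("/")
            rec["hom_ref"] = all(a == "0" for a in alleles)
            missing.append(rec)
    return missing


# Toy input (hypothetical records, not taken from the real files).
toy_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSample",
    "1\t1000\trs0000001\tC\t.\t50\t.\t.\tGT:DP\t0/0:30",
    "1\t2000\trs0000002\tG\tA\t50\t.\t.\tGT:DP\t0/1:28",
]

report = missing_alt_report(toy_vcf)
for rec in report:
    print(rec["rsid"], rec["gt"], rec["hom_ref"])
```

Any record reported with `hom_ref` set to `False` would contradict the pattern described above and would be worth inspecting by hand.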

Since these are ancient DNA calls, I will have to filter out some types of substitutions, but I can't do that if the alternate allele is not listed. How do I fix this?
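One possible approach, sketched below under stated assumptions: fill a missing ALT from an external lookup keyed by rs ID (for example, a table exported from dbSNP), then filter on substitution type. The sketch assumes the substitutions to drop are transitions (C/T and G/A), which is a common choice for ancient DNA because of deamination damage; the post does not say which types are meant. The lookup table `known_alts`, the helper names, and the records are all hypothetical.

```python
# Unordered base pairs that count as transitions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def fill_alt(rsid, alt, known_alts):
    """Return the listed ALT, or look it up by rs ID when it is '.'."""
    if alt != ".":
        return alt
    return known_alts.get(rsid, ".")     # stays '.' if the ID is unknown


def is_transition(ref, alt):
    """True for single-base A<->G or C<->T substitutions."""
    if len(ref) != 1 or len(alt) != 1:
        return False                     # skip indels and multi-allelic ALTs
    return frozenset((ref.upper(), alt.upper())) in TRANSITIONS


# Hypothetical lookup (rs ID -> alternate allele) and toy records.
known_alts = {"rs0000001": "T", "rs0000003": "C"}
records = [
    ("rs0000001", "C", "."),
    ("rs0000002", "G", "A"),
    ("rs0000003", "A", "."),
    ("rs0000004", "T", "."),
]

kept, unresolved = [], []
for rsid, ref, alt in records:
    alt = fill_alt(rsid, alt, known_alts)
    if alt == ".":
        unresolved.append(rsid)          # no ALT available anywhere
    elif not is_transition(ref, alt):
        kept.append((rsid, ref, alt))    # transversions survive the filter

print(kept)
print(unresolved)
```

Sites left in `unresolved` cannot be classified, so whether to keep or drop them is a separate decision.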

Also, I plan on using vcf-isec to intersect the two files. How will the inconsistent alt allele information between them affect this? Thanks!
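To illustrate why the question matters, the sketch below intersects two toy record sets twice: once keyed on position only, and once keyed on position plus alleles. It does not reproduce how vcf-isec actually matches records, which should be checked in the vcftools documentation; the function name and data are hypothetical.

```python
def site_keys(records, use_alleles):
    """Build the set of keys used to decide whether two sites match."""
    keys = set()
    for chrom, pos, ref, alt in records:
        if use_alleles:
            keys.add((chrom, pos, ref, alt))
        else:
            keys.add((chrom, pos))
    return keys


# Toy records: the first site has ALT '.' in one file and 'T' in the other.
neanderthal = [("1", 1000, "C", "."), ("1", 2000, "G", "A")]
denisovan = [("1", 1000, "C", "T"), ("1", 2000, "G", "A")]

by_position = site_keys(neanderthal, False) & site_keys(denisovan, False)
by_alleles = site_keys(neanderthal, True) & site_keys(denisovan, True)

print(len(by_position), len(by_alleles))
```

If matching is by position only, the missing ALT does not change which sites are shared; if alleles are part of the key, the site at position 1000 drops out of the intersection.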

-Deven

vcftools VCF dbSNP SNP • 3.1k views

Can anyone assist with this? Why does this happen? I have ~6K SNPs with no alternate allele(s) on chr22 in one of the call sets I generated (GRCh37, Homo sapiens, Illumina HiSeq, 150 bp).
