Hello,
Has anyone tried overlaying human genome variants with 1000 Genomes, HapMap, or dbSNP? If so, what do the results look like?
For my human genomes, I tried the overlay using vcftools:
1) the SOLiD data (3.2 million SNPs) with HapMap (ftp://ftp.broadinstitute.org/bundle/2.3/hg19/hapmap_3.3.hg19.vcf): this gives me approximately 22% overlay
2) the same sample sequenced with Illumina (4.1 million SNPs) with HapMap: this gives me 23%
Similarly, with dbSNP I get 54% for SOLiD and 63% for Illumina.
I am not sure whether these results look reasonable or whether I'm going wrong somewhere in my analysis.
Thank you.
By overlay, I mean that I want to see how many of my variants are also present in (i.e. shared with) HapMap or 1000 Genomes.
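
To make that definition concrete, here is a minimal Python sketch of the comparison. It is not the vcftools procedure I ran, only an illustration of the percentage I am describing. It assumes plain tab-separated VCF files, treats two variants as the same when chromosome, position, REF, and ALT all match, and strips a leading "chr" so that "chr1" and "1" compare equal. The function names (variant_keys, read_vcf, percent_overlap) are made up for this example.

```python
import gzip


def variant_keys(lines):
    """Yield one (chrom, pos, ref, alt) key per ALT allele in VCF text lines."""
    for line in lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Assumption: "chr1" and "1" refer to the same contig.
        if chrom.startswith("chr"):
            chrom = chrom[3:]
        # Multi-allelic sites are split so each ALT allele is counted on its own.
        for allele in alt.split(","):
            yield (chrom, pos, ref.upper(), allele.upper())


def read_vcf(path):
    """Load every variant key from a plain or gzip-compressed VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return set(variant_keys(handle))


def percent_overlap(sample_keys, reference_keys):
    """Percentage of the sample's variants that also occur in the reference set."""
    if not sample_keys:
        return 0.0
    shared = sample_keys & reference_keys
    return 100.0 * len(shared) / len(sample_keys)
```

Usage would look like percent_overlap(read_vcf("sample.vcf"), read_vcf("hapmap_3.3.hg19.vcf")), where sample.vcf is a placeholder name. Note that the denominator here is my own variant count, so the percentage answers "what fraction of my calls are already known", not "what fraction of HapMap did I recover".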