Good afternoon,
I have a list of 100 genes and, for one specific transcript of each, I would like to get the "synonymous coding" and "non-synonymous coding" SNPs that are observed in the 1000G data (n=629).
Moreover, it would be fantastic to somehow extract the heterozygosity status for those SNPs.
I tried the Ensembl 1000G browser, but there are inconsistencies: some SNPs that appear in the VCF file do not show up in the browser view. In addition, I do not want to work with dbSNP as a whole; I am only interested in the SNPs observed in 1000G.
Any help would be much appreciated.
Do you have the VCF file describing the 1000G variants that you want to use?
The inconsistencies you see are probably caused by the fact that there are different 1000 Genomes releases. In particular, a new one was published in October 2011, including almost 2000 individuals (http://www.1000genomes.org/announcements/october-2011-integrated-variant-set-release-ichg2011-2011-10-12). Which release are you interested in?
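Regarding the heterozygosity status: if the VCF file you have carries per-sample genotypes (a GT entry in the FORMAT column), the number of heterozygous individuals per SNP can be counted directly from the file, whichever release it comes from. Below is a minimal sketch in plain Python, assuming an uncompressed, tab-delimited VCF with diploid genotypes. The function name `het_counts` and the toy records are made up for illustration and are not real 1000G data. It does not annotate synonymous/non-synonymous consequences; that needs a separate annotation step against your chosen transcripts.

```python
import io

BASES = set("ACGT")

def het_counts(vcf_lines):
    """Yield (chrom, pos, id, n_het, n_called) for every SNP record.

    Hypothetical helper: assumes diploid genotypes such as 0|1 or 0/1.
    Samples with a missing genotype (./.) are not counted as called.
    """
    for line in vcf_lines:
        if line.startswith("#"):          # skip meta and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:              # no genotype columns in this record
            continue
        chrom, pos, snp_id, ref, alt = fields[:5]
        # Keep single-base substitutions only (drops indels and other types).
        if ref.upper() not in BASES:
            continue
        if not all(a.upper() in BASES for a in alt.split(",")):
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        n_het = n_called = 0
        for sample in fields[9:]:
            gt = sample.split(":")[gt_idx]
            alleles = gt.replace("|", "/").split("/")
            if len(alleles) != 2 or "." in alleles:
                continue                  # missing or non-diploid genotype
            n_called += 1
            if alleles[0] != alleles[1]:
                n_het += 1
        yield chrom, int(pos), snp_id, n_het, n_called

# Toy input with three samples; the IDs and positions are invented.
TOY_VCF = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n"
    "1\t100\ttoy1\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\t0|0\n"
    "1\t200\ttoy2\tC\tCT\t.\tPASS\t.\tGT\t0|1\t0|1\t0|0\n"
    "1\t300\ttoy3\tT\tC\t.\tPASS\t.\tGT:DP\t1|0:5\t./.:0\t0|1:7\n"
)

for chrom, pos, snp_id, n_het, n_called in het_counts(io.StringIO(TOY_VCF)):
    print(chrom, pos, snp_id, n_het, n_called, sep="\t")
```

For a real file you would pass an open file handle instead of the `io.StringIO` object (for example `gzip.open(path, "rt")` for a compressed VCF), and restrict the output to the positions that fall in your transcripts of interest.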