Are there any other databases I should be looking at to determine how many times (if any) a variant has been seen in other tumours, normals, etc.?
For population allele frequencies of germline variants, 1000 Genomes, ESP 6500, and ExAC will probably be more than you need. For somatic variants, COSMIC is probably the easiest, but TCGA and ICGC are also possibilities. If you want databases of clinically annotated variants, ClinVar is your best bet. If you have an exorbitant amount of money sitting around, HGMD Pro is pretty good. If you really want to get detailed and have a high tolerance for headaches, there are many small datasets with clinically annotated variants (but without genomic coordinates, so those will have to be parsed out somehow). These include the locus-specific databases (LSDBs), Emory's EmVClass, Clinvitae, and many more.
If you'd like to access any of these programmatically, you should check out SolveBio. We've imported most of these databases into our Data Library.
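If you'd rather roll your own for the population databases, here's a minimal sketch. It assumes you've downloaded a sites VCF that carries `AC` (allele count) and `AN` (total alleles) in the INFO column; check the header of the file you actually have, since field names can differ between releases. The function name `lookup_variant` and the records in `TOY_VCF` are made up for illustration and are not real database entries.

```python
import io


def lookup_variant(vcf_handle, chrom, pos, ref, alt):
    """Return (AC, AN) for one allele from a VCF-style stream, or None if absent."""
    for line in vcf_handle:
        if line.startswith("#"):
            continue  # skip meta and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        if fields[0] != chrom or int(fields[1]) != pos or fields[3] != ref:
            continue
        alts = fields[4].split(",")  # multi-allelic sites list several ALTs
        if alt not in alts:
            continue
        info = dict(
            item.split("=", 1) for item in fields[7].split(";") if "=" in item
        )
        idx = alts.index(alt)
        ac = int(info["AC"].split(",")[idx])  # AC has one value per ALT allele
        an = int(info["AN"])
        return ac, an
    return None


# Toy records, invented for illustration only (assumed AC/AN layout).
TOY_VCF = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t1000\t.\tA\tG,T\t.\tPASS\tAC=3,1;AN=200\n"
    "2\t5000\t.\tC\tT\t.\tPASS\tAC=10;AN=400\n"
)

hit = lookup_variant(io.StringIO(TOY_VCF), "1", 1000, "A", "T")
if hit is not None:
    ac, an = hit
    print(f"seen {ac} times in {an} alleles (AF = {ac / an:.4f})")
```

For a real file you'd pass an open file handle instead of the `io.StringIO` wrapper. This scans line by line, so for anything beyond a handful of lookups an indexed approach will be much faster.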
Hope that helps
written 9.6 years ago by dandan, updated 2.4 years ago by Ram
1000 Genomes, ESP & ExAC would be pretty comprehensive for population allele frequencies. I haven't come across anything other than COSMIC for cancer-related variants.