Hi - this concerns the variant files in the current release:
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/variants
I am interested in putting all variants annotated within humsavar.txt through Annovar or other annotation tools, and perhaps even scoring them with something like CADD. For this I need all the variants in that file in chromosomal format. Elizabeth Gasteiger kindly pointed me in the direction of the UniProt link above and suggested that I could map them via the homo_sapiens_variation.txt.gz file (see https://www.biostars.org/p/187668/#187858).
I have merged the two files on protein identifier and amino acid change, but although humsavar contains roughly 75k variants, the merge yields only 6252 entries.
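For anyone wanting to reproduce this kind of merge, here is a minimal sketch. It assumes both files have already been parsed into lists of dicts. The field names `accession`, `aa_change` and `chrom_pos` are hypothetical, and the rows are toy data rather than real UniProt records. Returning the unmatched rows as well makes it easier to see why a merge drops entries, for example stray whitespace or a different notation for the amino acid change.

```python
from collections import defaultdict


def make_key(accession, aa_change):
    """Build the join key; stripping whitespace avoids spurious mismatches."""
    return (accession.strip(), aa_change.strip())


def merge_variants(humsavar_rows, variation_rows):
    """Inner-join two lists of dicts on (accession, aa_change).

    Returns (matched, unmatched): matched pairs each humsavar row with every
    variation row sharing its key; unmatched holds humsavar rows with no partner.
    """
    index = defaultdict(list)
    for row in variation_rows:
        index[make_key(row["accession"], row["aa_change"])].append(row)

    matched, unmatched = [], []
    for row in humsavar_rows:
        hits = index.get(make_key(row["accession"], row["aa_change"]))
        if hits:
            matched.extend((row, hit) for hit in hits)
        else:
            unmatched.append(row)
    return matched, unmatched


# Toy rows, invented for illustration only (hypothetical field names and values).
humsavar_rows = [
    {"accession": "P00001", "aa_change": "p.Ala10Thr"},
    {"accession": "P00002", "aa_change": "p.Gly20Ser"},
]
variation_rows = [
    {"accession": "P00001", "aa_change": "p.Ala10Thr ", "chrom_pos": "toy-position"},
]

matched, unmatched = merge_variants(humsavar_rows, variation_rows)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

Inspecting a handful of the `unmatched` rows by hand is usually the quickest way to tell a formatting mismatch from a variant that is genuinely absent from the second file.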
Even when matching on rsID, I can't find some variants in homo_sapiens_variation.txt.gz that are in humsavar.
(Try searching for rs11047499 in homo_sapiens_variation.txt.gz; you won't find it.)
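A minimal sketch of such a search, assuming the file has been downloaded locally. The function name `lines_with_id` is made up for illustration. It matches the identifier as a whole word, so that a longer rsID sharing the same prefix does not count as a hit.

```python
import gzip
import re


def lines_with_id(path, identifier):
    """Yield lines of a gzipped text file that contain `identifier` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(identifier) + r"\b")
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if pattern.search(line):
                yield line.rstrip("\n")


# Usage (assumes a local copy of the file in the working directory):
# for line in lines_with_id("homo_sapiens_variation.txt.gz", "rs11047499"):
#     print(line)
```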
Was I wrong to think that all variants in humsavar originate from the homo_sapiens_variation.txt.gz file? If they don't, the variants in humsavar must have come from a file (containing chromosomal coordinates) somewhere.
I'd like to find such a file to do the mapping.
Thanks.
Hi @emmahe - although the chromosome number and chromosome position are given in the annotation track BED file, the nucleotide allele change is not present. Perhaps I should have explained this a bit better.
Thanks.