Hello Mr Locuace,
this "ID" looks to me like it describes:
- the chromosome
- position
- reference allele
- alternative allele
- reference genome
So to get the corresponding rs ID, one solution is to extract the information from the ID and convert it to a VCF file. You can then annotate the ID column. There was a similar thread some time ago.
Let's start by creating a VCF file. I assume you have a file id.txt that contains one ID per line:
awk -F_ -v OFS="\t" '{print $1,$2,".",$3,$4,".",".","."}' id.txt|sort -k1,1V -k2,2g > ids.vcf
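For illustration, here is the same step on three made-up IDs, assuming the fields are joined by underscores in the order listed above (something like 1_1000_A_G_b37; that exact format and the file names demo_id.txt and demo_ids.vcf are my assumptions). The sketch also prepends a minimal VCF header, because the VCF specification expects one and some tools refuse files without it. It needs GNU sort for the -V option.

```shell
# Made-up example IDs (chrom_pos_ref_alt_genome); replace with your own file.
printf '%s\n' '2_2000_C_T_b37' '1_1000_A_G_b37' '10_500_G_A_b37' > demo_id.txt

# Write a minimal header first, then the records sorted by chromosome and position.
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  awk -F_ -v OFS="\t" '{print $1,$2,".",$3,$4,".",".","."}' demo_id.txt \
    | sort -k1,1V -k2,2n
} > demo_ids.vcf

cat demo_ids.vcf
```

Note that the fifth field (the reference genome) is not written to the VCF, so you have to keep track of the build yourself.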
Now you can take this VCF file and annotate it with e.g. SnpSift:
java -jar SnpSift.jar annotate -id dbSNP.vcf.gz ids.vcf > ids_annotated.vcf
Of course, you first have to download the dbSNP VCF file, and it should be for the same reference genome as your IDs.
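Once the annotation has run, you probably want a table that maps each original variant back to its rs ID. A minimal sketch, assuming the annotated file has the standard tab-separated VCF columns; the function name vcf_to_id_table is hypothetical and just for illustration:

```shell
# Hypothetical helper: print "chrom_pos_ref_alt<TAB>ID" for each VCF record.
# Header lines (starting with "#") are skipped. Records without a match
# keep whatever was in the ID column of the input, here ".".
vcf_to_id_table() {
  awk -F'\t' -v OFS='\t' '!/^#/ {print $1 "_" $2 "_" $4 "_" $5, $3}' "$1"
}
```

You would call it as: vcf_to_id_table ids_annotated.vcf > id_to_rs.tsv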
fin swimmer
Can you give context on the IDs? How did you get them? What's the biological context?
Hi Hussain Ather, I just edited my post