I have a set of protein sequences (~3,000) that I want to compare against reference protein sequences, with the differences reported in HGVS format. Example:
ARNDCEQGHILKMFPSTWYV <=> ARNMCEQGHILKMFPSTYV => p.[Asp4Met;Trp18del]
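To make the example concrete, here is a minimal sketch of the simple case. It assumes Python and uses the standard-library `difflib.SequenceMatcher` as a stand-in aligner; it is not my actual routine. The names `naive_hgvs`, `THREE`, `_span` and `_residues` are made up for illustration. The output is only HGVS-like: the sketch does not shift indels to the most C-terminal position, does not detect duplications, and knows nothing about frameshifts or extensions, which is exactly where things get hard.

```python
from difflib import SequenceMatcher

# One-letter to three-letter codes for the 20 standard amino acids.
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def _span(ref, start, end):
    """Position string for the 0-based, half-open reference slice [start, end)."""
    first = f"{THREE[ref[start]]}{start + 1}"
    if end - start == 1:
        return first
    return f"{first}_{THREE[ref[end - 1]]}{end}"


def _residues(seq):
    """Concatenate three-letter codes for a stretch of residues."""
    return "".join(THREE[aa] for aa in seq)


def naive_hgvs(ref, obs):
    """Describe obs relative to ref as an HGVS-like protein string (simple cases only)."""
    changes = []
    matcher = SequenceMatcher(None, ref, obs, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace" and i2 - i1 == j2 - j1:
            # Equal-length replacement: report residue by residue.
            for k in range(i2 - i1):
                pos = i1 + k
                changes.append(f"{THREE[ref[pos]]}{pos + 1}{THREE[obs[j1 + k]]}")
        elif tag == "replace":
            changes.append(f"{_span(ref, i1, i2)}delins{_residues(obs[j1:j2])}")
        elif tag == "delete":
            changes.append(f"{_span(ref, i1, i2)}del")
        elif tag == "insert":
            if i1 == 0 or i1 == len(ref):
                # No flanking residue on one side; extensions are out of scope here.
                raise ValueError("terminal insertion (extension) not handled")
            changes.append(f"{_span(ref, i1 - 1, i1 + 1)}ins{_residues(obs[j1:j2])}")
    if not changes:
        return "p.="
    if len(changes) == 1:
        return f"p.{changes[0]}"
    return "p.[" + ";".join(changes) + "]"


print(naive_hgvs("ARNDCEQGHILKMFPSTWYV", "ARNMCEQGHILKMFPSTYV"))
# prints: p.[Asp4Met;Trp18del]
```

Note that `SequenceMatcher` looks for long matching blocks and does not use a substitution matrix or gap penalties, so on more divergent sequences its opcodes may not correspond to a biologically sensible alignment.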
I have written and rewritten routines for this, but with multiple deletions/insertions and/or frameshifts it gets complicated... What are my options?
Thanks.