Say I am given a protein U1 from the UniProt database, and according to UniProt's mapping data file, U1 maps to R1 in RefSeq's protein database. The sequences of U1 and R1 are very similar, but len(R1) > len(U1), presumably because R1 contains an extra region. What is an efficient way to align these two proteins? That is, I want to pad U1 so that len(U1) == len(R1), with the chunk that U1 is missing filled in with a gap symbol, e.g. "-". Would I have to use some recursive segmentation algorithm?
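
To make the desired output concrete, here is a minimal sketch of a standard global alignment (Needleman-Wunsch dynamic programming) in plain Python. The function name `global_align`, the scoring values (match 1, mismatch -1, gap -1), and the toy sequences are all assumptions made for illustration; they are not real UniProt or RefSeq entries, and the scoring is not tuned for proteins.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return both padded with '-'.

    Illustrative scoring only: a linear gap penalty and a flat
    match/mismatch score, not a protein substitution matrix.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy sequences (made up): r1 is u1 with an extra "GGG" region inserted.
u1 = "MKTAYIAK"
r1 = "MKTGGGAYIAK"
aligned_u1, aligned_r1 = global_align(u1, r1)
print(aligned_u1)  # MKT---AYIAK
print(aligned_r1)  # MKTGGGAYIAK
```

This runs in O(len(U1) * len(R1)) time and memory, with no recursion or segmentation involved. For real proteins, an affine gap penalty and a substitution matrix such as BLOSUM62 would likely keep the missing region as one contiguous gap more reliably than the flat scoring assumed here.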