Question

Translating Nucleotide MSA to Amino Acid MSA

1

10.5 years ago

weslfield ▴ 90

Hi, I have a nucleotide multiple sequence alignment that I would like to translate into an amino acid MSA, using the reading frame of a reference sequence in that alignment. I'm looking for the best way to do this, preferably with Biopython. Thanks!

alignment msa sequence • 3.6k views

updated 3.0 years ago by Ram 44k • written 10.5 years ago by weslfield ▴ 90

0

Why is it better to translate a nucleotide sequence to an amino acid sequence for MSA?

4.8 years ago by mrinsmrids • 0

0

DNA codons are redundant (several codons can encode the same amino acid); amino acids are not.

4.8 years ago by ATpoint 86k
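
To make the redundancy point concrete, here is a small dependency-free sketch that groups the 64 codons of the standard genetic code by the amino acid they encode. It does not use Biopython; the names `BASES`, `AMINO_ACIDS` and `codons_by_aa` are made up for this illustration, and the amino acid string is the standard code (NCBI translation table 1) written out in T, C, A, G order.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Group codons by the amino acid (or "*" for stop) they encode.
codons_by_aa = defaultdict(list)
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    codons_by_aa[aa].append("".join(codon))

# Leucine is encoded by several codons; methionine and tryptophan by one each.
for aa in "LMW":
    print(aa, codons_by_aa[aa])
```

Leucine has six codons while methionine and tryptophan have one each, so two sequences can differ at the nucleotide level and still be identical at the protein level.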

Ram · Answer 1 · 2014-10-01

Not sure you are still interested, but here is one way. This solution assumes that the nucleotide alignment is a codon alignment, i.e. gaps come in in-frame blocks of three. If it is not, you will end up with a bunch of "X" characters in the amino acid alignment.

from Bio import SeqIO

FRAME = 0  # change to 0, 1 or 2 depending on the frame of the reference

with open("translated.fas", "w") as out:
    # change "phylip" to whichever format the alignment is in
    for record in SeqIO.parse("alignment.phy", "phylip"):
        sequence = []
        for c in range(FRAME, len(record.seq), 3):
            codon = record.seq[c:c+3]
            if len(codon) < 3:
                break  # skip a trailing partial codon
            if "-" not in str(codon):
                sequence.append(str(codon.translate()))
            elif str(codon) == "---":
                sequence.append("-")  # whole-codon gap stays a gap
            else:
                sequence.append("X")  # gap splits the codon
        # write the record in FASTA format
        out.write(">" + record.id + "\n")
        out.write("".join(sequence) + "\n")
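
To see what the codon-alignment assumption means in practice, here is a toy, dependency-free sketch of the same gap rule applied to two tiny made-up alignments. This is not Biopython code: `translate_gapped`, `CODON_TABLE` and the example sequences are hypothetical names and data for illustration only, and anything that is not a complete unambiguous codon (including a codon split by a gap) is simply mapped to "X".

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_gapped(seq, frame=0):
    """Translate one aligned nucleotide row, three columns at a time."""
    protein = []
    # Stop before any trailing partial codon.
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3].upper()
        if codon == "---":
            protein.append("-")  # whole-codon gap stays a gap
        else:
            # Codons split by a gap (or with ambiguous bases) become "X".
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)

# Gaps in an in-frame block of three: a proper codon alignment.
codon_aligned = {"ref": "ATGGCTAAA", "s1": "ATG---AAA"}
# Same gap shifted by one column: no longer a codon alignment.
not_codon_aligned = {"ref": "ATGGCTAAA", "s1": "ATGG---AA"}

for name, alignment in [("codon", codon_aligned), ("shifted", not_codon_aligned)]:
    for seq_id, seq in alignment.items():
        print(name, seq_id, translate_gapped(seq))
```

In the shifted alignment the gap straddles two codons, so both affected positions come out as "X" even though only three nucleotides are missing. If your alignment looks like that, it is probably better to realign at the codon level first than to translate column by column.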