My input multiple sequence alignments contain stop codons ("TAG", "TGA", "TAA") in the middle of a few sequences. I want to remove every sequence that contains a stop codon in the middle, while ignoring the terminal stop codon. Is there a Biopython function or another program that can do this? (I am an Ubuntu user.)
My input:
>A
ATGGCAGCAGATTCCAACTAA
>B
ATGGCTAAAGATTCCAACTAA
>C
ATGGCATAAGATTCCAACTAA
The output should be:
>A
ATGGCAGCAGATTCCAACTAA
Thanks in advance.
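One possible approach, as a minimal sketch rather than a definitive tool: the script below uses only the Python standard library, and the function names (`parse_fasta`, `has_internal_stop`, `filter_records`) and the `in_frame` switch are made up for illustration. It assumes the last three bases of each sequence are the terminal stop codon and that alignment gaps (`-`) can be stripped before checking. Note that in the example above, sequence B has no stop codon in reading frame 1 (ATG GCT AAA ...); its "TAA" only appears out of frame. Keeping only A therefore corresponds to the any-frame check (`in_frame=False`), while the in-frame check keeps both A and B.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def has_internal_stop(seq, in_frame=True):
    """Return True if a stop codon occurs before the last three bases.

    in_frame=True  : only codons in reading frame 1 are checked.
    in_frame=False : any occurrence of TAG/TGA/TAA counts, in any frame.
    """
    # Assumption: gaps can be dropped and the input is DNA, not RNA.
    seq = seq.upper().replace("-", "")
    body = seq[:-3]  # assumption: the final three bases are the terminal codon
    if in_frame:
        return any(body[i:i + 3] in STOP_CODONS
                   for i in range(0, len(body) - 2, 3))
    return any(stop in body for stop in STOP_CODONS)


def filter_records(text, in_frame=True):
    """Return the (header, sequence) pairs that have no internal stop codon."""
    return [(header, seq) for header, seq in parse_fasta(text)
            if not has_internal_stop(seq, in_frame=in_frame)]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python filter_stops.py < in.fasta > out.fasta
    for header, seq in filter_records(sys.stdin.read(), in_frame=False):
        print(f">{header}")
        print(seq)
```

If Biopython is installed, `Bio.SeqIO.parse` could replace the hand-written parser; the stop-codon check itself would stay the same.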
Thank you so much, this works perfectly 🙂
Please accept the answer if it solved your problem.