Hi all,
I have recently come across amino acid sequences with a lot of asterisks inside them, for example X*X*XXX*XXX. Can anyone explain what this means? I understand that some sequences have an asterisk at the end to indicate a stop codon, but in this case the asterisks occur multiple times. They don't seem to stand for unknown amino acids, because the sequences contain Xs as well. Thanks a lot!
Depending on the prediction software, a * in the middle of an ORF could also represent, for example, a UGA codon, which is read as stop, tryptophan, or selenocysteine depending on the translation system.
To expand on that -
Normally, if you see lots of stop codons, that means you are either translating a non-coding region, or are in the wrong reading frame.
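One quick way to check the reading-frame explanation is to translate the nucleotide sequence in each of the three forward frames and count the asterisks in each. Below is a minimal sketch in plain Python, assuming the standard genetic code; the function names (translate, stops_per_frame) and the example sequence are made up for illustration, and it does not handle the reverse strand or alternative codon tables.

```python
from itertools import product

# Standard genetic code, written in the usual TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a nucleotide string starting at offset `frame` (0, 1 or 2).

    Stop codons become '*'; codons with ambiguous bases become 'X'.
    """
    dna = dna.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def stops_per_frame(dna):
    """Count '*' characters in each of the three forward reading frames."""
    return {frame: translate(dna, frame).count("*") for frame in range(3)}


# Hypothetical example sequence: frame 0 is an ORF ending in a single stop.
seq = "ATGATAGCTAAGTGA"
for frame in range(3):
    print(frame, translate(seq, frame))
print(stops_per_frame(seq))
```

If one frame gives a long translation with a single * at the end and the others are full of asterisks, the original translation was probably done in the wrong frame. If all frames are full of asterisks, the region is likely non-coding, or the organism uses a different codon table.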