5.6 years ago
860101959
Recently, I received a FASTA-format sequence file from one of my colleagues, but there are some strange characters like ^? in the sequences. Does anyone know why, and how I can delete these characters? There are a lot of ^? in the sequences and I don't want to delete them manually.
I tried to match these characters in vim with \^\?, ^? or \^?, but failed. Since the data is output from MEGA, maybe the cause lies there.
The sequence is like this:
^?MRATGEKRVLQLHELDEFCLDSYENAKIYKEKTERWHNRHIREKEIEVGQQVLMFNSHLKLFSGKLKSRWSGSFTVVAVFPHSKLERIAEDLLIE
Apart from Pierre Lindenbaum's remark, does the file contain the typical FASTA header lines (starting with > followed by some text denoting the sequence ID/name)?
Thanks, I think it is an encoding problem.
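For anyone who hits the same thing: vim and `cat -v` display the single DEL byte (0x7F) as ^?, so searching for a literal caret followed by a question mark will not match it. Below is a minimal Python sketch for stripping such bytes, assuming the stray characters really are DEL or other ASCII control characters. The function names and the file names in the usage note are made up for illustration and do not come from the original post.

```python
import re

# ASCII control characters, including DEL (0x7F), which vim shows as ^?.
# Tab, line feed and carriage return are left alone so the layout survives.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text: str) -> str:
    """Return text with stray control characters removed."""
    return CONTROL_CHARS.sub("", text)


def clean_fasta(in_path: str, out_path: str) -> None:
    """Copy a FASTA file line by line, dropping control characters."""
    # latin-1 maps every byte to one character, so decoding cannot fail
    # and all other bytes are written back unchanged.
    with open(in_path, encoding="latin-1", newline="") as src, \
            open(out_path, "w", encoding="latin-1", newline="") as dst:
        for line in src:
            dst.write(strip_control_chars(line))
```

Usage would look like `clean_fasta("input.fasta", "cleaned.fasta")` (hypothetical file names). Inside vim, the same byte can be matched by its hex code instead: `:%s/\%x7f//g`.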
If an answer was helpful you should upvote it, if the answer resolved your question you should mark it as accepted.