7.3 years ago
Tamandua
I have a FASTA file that includes sequences with identical names. I need to assemble contigs from these sequences using MIRA. However, MIRA seems to have a problem with identical sequence names. Is there a way to solve this problem?
Are these sequences (with identical names) identical? If so, you can use fastx_collapser from fastx-toolkit or rmdup from seqkit. If not, append a unique string to the identical IDs.
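If you are not sure whether the records sharing a name are really identical, a quick check along these lines may help. This is only a minimal sketch in plain Python: the function names `parse_fasta` and `duplicate_id_report` are made up for illustration, the ID is assumed to be the first whitespace-delimited word of the header, and sequences are compared case-insensitively.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def duplicate_id_report(text):
    """Map each repeated ID to True if all its sequences are identical.

    Assumption: the ID is the first word of the header line.
    """
    seqs_by_id = defaultdict(list)
    for header, seq in parse_fasta(text):
        words = header.split()
        seq_id = words[0] if words else ""
        seqs_by_id[seq_id].append(seq.upper())
    return {
        seq_id: len(set(seqs)) == 1
        for seq_id, seqs in seqs_by_id.items()
        if len(seqs) > 1
    }
```

IDs that map to `True` could be collapsed; IDs that map to `False` need renaming instead.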
No, they are not identical. I used
to get the best matching hits. Maybe there is a better way of doing this? And how can I rename identical ids?
Example FASTA with identical headers:
Rename identical FASTA IDs:
Output:
Download seqkit from here. The output can be redirected to either fa or fa.gz with the -o option.
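If installing seqkit is not an option, the same idea can be sketched in plain Python. This is an illustration under assumptions, not seqkit's implementation: the function name `rename_duplicate_ids` and the `_2`, `_3`, ... suffix scheme are choices made here, and the ID is taken to be the first word of the header.

```python
from collections import Counter


def rename_duplicate_ids(lines):
    """Yield FASTA lines, making repeated IDs unique with a numeric suffix.

    The first occurrence of an ID is left unchanged; later ones become
    ID_2, ID_3, ... (suffix scheme is an assumption of this sketch).
    """
    seen = Counter()
    for line in lines:
        if line.startswith(">"):
            # Split the header into the ID and an optional description.
            parts = line[1:].rstrip("\n").split(None, 1)
            seq_id = parts[0] if parts else ""
            desc = " " + parts[1] if len(parts) > 1 else ""
            seen[seq_id] += 1
            if seen[seq_id] > 1:
                seq_id = f"{seq_id}_{seen[seq_id]}"
            yield f">{seq_id}{desc}\n"
        else:
            # Sequence lines pass through; ensure a trailing newline.
            yield line if line.endswith("\n") else line + "\n"
```

To use it on files, open the input and output (the names are up to you) and pass the input handle in, e.g. `fout.writelines(rename_duplicate_ids(fin))`. Note that the sketch does not guard against a generated name such as `a_2` colliding with an ID that already exists in the file.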
Just rename all sequences; just google "rename fasta header". There are dozens of ways; see two at this post. With awk: or with R:
Ah, thanks a lot! That's it!