7.2 years ago
horsedog
I have a bunch of genome sequences in a single file named sequence.fasta, but some of them have exactly the same name, like this:
> Rhodobacter_sphaeroides_2.4.1_chromosome_2
ATGAGCTTTCCCCATTTCGCGGCCCTCTTCCGGCCCTCGCAGTTCTTCGGCATCCGCGGCGGCGTCCACCCCGAGACGCG
>Rhodobacter_sphaeroides_2.4.1_chromosome_2
GTGCAGGTGGTGCCGACCCAGTATCCGATGGGCTCGGAGAAGCATCTGGTGAAGATCCTGACCGGGCGCGAGACGCCGGC
Is there any way to detect the sequences that share a name and add a suffix automatically, so I can tell them apart? This is what I want:
> Rhodobacter_sphaeroides_2.4.1_chromosome_2.1
ATGAGCTTTCCCCATTTCGCGGCCCTCTTCCGGCCCTCGCAGTTCTTCGGCATCCGCGGCGGCGTCCACCCCGAGACGCG
> Rhodobacter_sphaeroides_2.4.1_chromosome_2.2
GTGCAGGTGGTGCCGACCCAGTATCCGATGGGCTCGGAGAAGCATCTGGTGAAGATCCTGACCGGGCGCGAGACGCCGGC
Sequences that already have unique names should be left unchanged.
Thanks a lot!
Note that each name is preceded by a ">".
SeqKit has a rename command for duplicated IDs: http://bioinf.shenwei.me/seqkit/usage/#rename
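If you would rather not install anything, the renaming described in the question can also be sketched in plain Python. This is only a minimal sketch, not a tested tool: the function names (split_header, rename_duplicates, rename_file) are made up for illustration, it assumes the whole file fits in memory, and it treats the text after ">" up to the first whitespace as the name. It also writes headers without a space after ">", so "> name" becomes ">name".

```python
from collections import Counter


def split_header(line):
    """Split a FASTA header into (name, description), tolerating '> name'."""
    fields = line[1:].strip().split(None, 1)
    name = fields[0] if fields else ""
    description = fields[1] if len(fields) > 1 else ""
    return name, description


def rename_duplicates(lines):
    """Append .1, .2, ... to names that occur more than once; keep unique names."""
    lines = [line.rstrip("\n") for line in lines]
    # First pass: count how often each name occurs.
    totals = Counter(split_header(l)[0] for l in lines if l.startswith(">"))
    seen = Counter()
    renamed = []
    # Second pass: number only the names that are repeated.
    for line in lines:
        if not line.startswith(">"):
            renamed.append(line)
            continue
        name, description = split_header(line)
        if totals[name] > 1:
            seen[name] += 1
            name = f"{name}.{seen[name]}"
        header = ">" + name
        if description:
            header += " " + description
        renamed.append(header)
    return renamed


def rename_file(in_path, out_path):
    """Hypothetical helper: read one FASTA file and write the renamed copy."""
    with open(in_path) as handle:
        renamed = rename_duplicates(handle)
    with open(out_path, "w") as out:
        out.write("\n".join(renamed) + "\n")
```

Calling rename_file("sequence.fasta", "renamed.fasta") would write the result to a new file (the output name is just an example). One caveat: if the file already contains a unique name that happens to look like "name.1", the new suffixes could collide with it, so check the output for duplicates afterwards.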