3.7 years ago
harry
I have a large FASTA file. As you can see below, some FASTA headers contain an extra > sign, like
>exon2_ENST00000218032|>exon2_ENST00000218032
>exon17_ENST00000253024|>exon17_ENST00000253024
I want to remove the extra > sign from the header, so that after removal the headers look like this:
>exon2_ENST00000218032|exon2_ENST00000218032
>exon17_ENST00000253024|exon17_ENST00000253024
An actual FASTA header has only one > sign, not more than one.
>exon2_ENST00000218032|>exon2_ENST00000218032
GAAGCTTTTGGTCTATATTGTTAATTGCCATTGCTGTAAATCTTAAAATGAATGAATAAAAATGTTTCATTTTACAAAAAACATGTTCCTTCAGTCGTCAATGCTGACCTGCATTTTCCTGCTAATATCTGGTTCCTGTGAGTTATGCGC
>exon1_ENST00000218032|>exon1_ENST00000218032
GCTGCTGCAAGTTACGGAATGAAAAATTAGAACAACAGAAACATGGTTTCTCTTCTCGGCCACCTCCTGCATAGAGGGTACCATTCTGC
>exon1_ENST00000218032|exon2_ENST00000218032
AAGCTTTTGGTCTATATTGTTAATTGCCATTGCTGTAAATCTTAAAATGAATGAATAAAAATGTTTCATTTTACAAGTTTCTCTTCTCGGCCACCTCCTGCATAGAGGGTACCATTCTGCGCTGCTGCAAGTTACGGAATGAAAAATTAG
>exon17_ENST00000253024|>exon17_ENST00000253024
TCCCTGGTGGCCCCATCCCCCAGTTCCTCACGATATGGTTTTTACTTCTGTGGATTTAATAAAAACTTCACCAGTTACAAGGCAGACGTGCAGTCCATCATCGGCCTGCAGCGCTTCTTCGAGACGCGCATGAACGAGGCCTTCGGTGAC
>exon16_ENST00000253024|>exon16_ENST00000253024
TACAGCTCCCCACAGGAGTTTGCCCAGGATGTGGGCCGCATGTTCAAGCAATTCAACAAGTTAACTGAGACCAGCCCGGTGGCACCCTGGATCTGACCCTGATCCGTGCCCGCCTCCAGGAGAAGTTGTCACCTCCC
So can anyone please tell me how to remove the extra > signs from my FASTA headers? Thanks in advance.
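The cleanup asked for above can be sketched in Python. This is only an illustration, not a command from the thread: the function names `clean_header` and `clean_fasta` are made up for this example, and it assumes every header line starts with > in the first column and that sequence lines never contain >.

```python
def clean_header(line: str) -> str:
    """Keep the leading '>' of a FASTA header and drop any later '>' characters."""
    if line.startswith(">"):
        return ">" + line[1:].replace(">", "")
    return line  # sequence lines are returned unchanged


def clean_fasta(lines):
    """Yield the lines of a FASTA file with extra '>' removed from header lines."""
    for line in lines:
        yield clean_header(line)


# Small in-memory example; the sequences are shortened for readability.
example = [
    ">exon2_ENST00000218032|>exon2_ENST00000218032\n",
    "GAAGCTTTTGG\n",
    ">exon1_ENST00000218032|exon2_ENST00000218032\n",
    "AAGCTTTTGG\n",
]
cleaned = list(clean_fasta(example))
print("".join(cleaned), end="")
```

Because `clean_fasta` accepts any iterable of lines, an open file handle can be passed to it directly and the result written out with `writelines`, so a large file is processed line by line without loading it all into memory.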
Based on the OP's format, it can be further simplified:
Use 2g in case there are more > signs in the header.
Thanks, it is working fine. Can you also help me remove duplicate headers together with their sequences? As you can see, ENST00000368129_1|ENST00000368129_3 contains 2 sequences. I tried this command to remove the duplicate sequences:
but it does not give the right output. It gives output like this:
I don't want to lose any header information, so can you tell me what is wrong with this command?
Thanks in advance for your instant reply.
There is no problem with the command. It doesn't trim anything. Check whether your input has any spaces in the headers. Please post this as a separate question, with headers from fastaFileWithNoLinebreaksInSeq.
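For reference, here is a rough Python sketch of deduplicating by header without trimming anything. It is not the command discussed above (which is not shown here). It assumes "duplicate" means two records with exactly the same full header line, spaces included, and it keeps the first record seen. The names `read_fasta` and `drop_duplicate_headers` and the sample records are made up for this illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs; the header is the full line, spaces included."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line, []
        elif header is not None:
            seq.append(line.strip())  # join wrapped sequence lines
    if header is not None:
        yield header, "".join(seq)


def drop_duplicate_headers(records):
    """Keep only the first record for each distinct full header line."""
    seen = set()
    for header, seq in records:
        if header not in seen:
            seen.add(header)
            yield header, seq


# Made-up records: the header with a space-separated note appears twice.
example = [
    ">ENST00000368129_1|ENST00000368129_3 note",
    "ACGT",
    "TTGA",
    ">ENST00000368129_1|ENST00000368129_3 note",
    "GGCC",
    ">other",
    "AAAA",
]
unique = list(drop_duplicate_headers(read_fasta(example)))
for header, seq in unique:
    print(header)
    print(seq)
```

Keying on the whole header line, rather than on the text before the first space, means nothing after a space is dropped or ignored, which is the concern raised in the thread.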