I have a FASTA file with multiple sequences and their sequence IDs:
>LR24F01_Bbc10_2_15-53841001-53841229
atgcccgccccccgcgccgcccccccctctctcgct
>MT24F01_Bbc10_2_15-53841001-53841229
atgcccgccccccgcgccgcccaaccctctctcgct
>LP39F01_Bbc10_2_15-53841001-53841229
atgcccgctccccgcgccgcccaaccctctctcgct
...... etc
Can someone help with a simple Perl script or Linux command line to remove this portion of each sequence ID: -53841001-53841229? I want my output to look like this:
>LR24F01_Bbc10_2_15
atgcccgccccccgcgccgcccccccctctctcgct
>MT24F01_Bbc10_2_15
atgcccgccccccgcgccgcccaaccctctctcgct
>LP39F01_Bbc10_2_15
atgcccgctccccgcgccgcccaaccctctctcgct
...... etc
Thanks.
Just a note for everyone: when lines begin with ">", BioStar formats the text as a blockquote. This can confuse people giving answers, since they may not realise that the sequence was supposed to be in FASTA format. If you indent the lines with 4 spaces, the FASTA displays properly.
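For what it's worth, here is a minimal sketch of the header clean-up asked for in the question. It is written in Python rather than the Perl or shell the question mentions, and it assumes the unwanted part is always a trailing "-digits-digits" block at the very end of a header line. The names `COORD_SUFFIX` and `strip_coords` are made up for this illustration.

```python
import re

# Assumption: the part to remove is always "-<digits>-<digits>" at the end of a header.
COORD_SUFFIX = re.compile(r"-\d+-\d+$")


def strip_coords(lines):
    """Yield FASTA lines with the trailing coordinate suffix removed from headers."""
    for line in lines:
        line = line.rstrip("\r\n")  # drop line endings so "$" anchors at the true end
        if line.startswith(">"):
            # Only header lines are edited; sequence lines pass through unchanged.
            line = COORD_SUFFIX.sub("", line)
        yield line


# First two records from the question, used as sample input.
example = [
    ">LR24F01_Bbc10_2_15-53841001-53841229",
    "atgcccgccccccgcgccgcccccccctctctcgct",
    ">MT24F01_Bbc10_2_15-53841001-53841229",
    "atgcccgccccccgcgccgcccaaccctctctcgct",
]

cleaned = list(strip_coords(example))
print("\n".join(cleaned))
```

To run it on a real file, pass an open file handle to `strip_coords` instead of the sample list and write the yielded lines to a new file. Headers that do not end in that pattern are left as they are.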