4.8 years ago
ravi.eshwari
Hi,
I have FASTA files that look like this:
>CATGTAGTGATTGATAGTGATA(1)
CATGTAGTGATTGATAGTGATA
>ATCCGTGAGCTTGAAGGATCCGCC(1)
ATCCGTGAGCTTGAAGGATCCGCC
>AAAACTACATATACATTCGGATT(2)
AAAACTACATATACATTCGGATT
>CCTGCATAGAGGATTCCGAAC(1)
CCTGCATAGAGGATTCCGAAC
>CATGAACAAGATGTTTGAGAACT(1)
CATGAACAAGATGTTTGAGAACT
I need to edit the header of each sequence and, along with that, print the sequences based on their abundance, each with a unique header.
Can someone kindly help me with this?
How can I do this on Linux, for example with shell scripting?
Thank you!
And what have you tried so far?
I used the following command.
It did change the headers, but I now also need to print the sequences with an abundance greater than one, each with a unique header.
Please provide an example of the output you would expect - your question is unanswerable at the moment.
Edit it how? What abundance?
In addition: technically, the posted example qualifies as FASTA format, but having the sequence repeated in the header is not going to make this easy to decipher.
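Until the expected output is clarified, here is a minimal sketch of one possible reading of the request. It assumes that "abundance" is the number in parentheses at the end of each header, that only records with a count greater than one should be printed, and that a "unique header" can be a running ID. The file names `reads.fa` and `filtered.fa`, the `min` threshold, and the `seq_<n>_x<count>` naming scheme are all hypothetical choices, not something stated in the question.

```shell
# Sample input copied from the question: each header is the sequence plus "(count)".
cat > reads.fa <<'EOF'
>CATGTAGTGATTGATAGTGATA(1)
CATGTAGTGATTGATAGTGATA
>ATCCGTGAGCTTGAAGGATCCGCC(1)
ATCCGTGAGCTTGAAGGATCCGCC
>AAAACTACATATACATTCGGATT(2)
AAAACTACATATACATTCGGATT
>CCTGCATAGAGGATTCCGAAC(1)
CCTGCATAGAGGATTCCGAAC
>CATGAACAAGATGTTTGAGAACT(1)
CATGAACAAGATGTTTGAGAACT
EOF

# Assumption: abundance = the integer inside the trailing parentheses.
# Keep records with abundance >= min and give each a new, unique header
# of the (hypothetical) form seq_<running number>_x<abundance>.
awk -v min=2 '
  /^>/ {
    count = 0
    # Locate the trailing "(N)" and pull out N without the parentheses.
    if (match($0, /\([0-9]+\)$/))
        count = substr($0, RSTART + 1, RLENGTH - 2) + 0
    keep = (count >= min)
    if (keep) printf(">seq_%d_x%d\n", ++n, count)
    next
  }
  # Sequence lines follow the decision made for their header.
  keep { print }
' reads.fa > filtered.fa

cat filtered.fa
```

With the sample above, only the record with `(2)` in its header passes the filter, so `filtered.fa` contains the header `>seq_1_x2` followed by `AAAACTACATATACATTCGGATT`. If you want something different (for example, sorting by abundance instead of filtering), please post the exact output you expect.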