Hi guys, I have a FASTA file with thousands of sequences with peg IDs. Can anyone please help me sort the sequences in the FASTA file according to their peg ID? Thanks in advance!
The .fasta file looks like this:
>fig|6666666.167416.peg.1
MYVAGHEGIELQPLSAADDAEARRLADEYFSRIDPAGR
>fig|6666666.167416.peg.3
MIAHASTPYSRKRGEPGPPHGPGLRRNEPHSGSTPFSL
>fig|6666666.167416.peg.2
MTHNQCLDLLESAEDTLDFLKSSLTYLGSPQKTENKAR
>fig|6666666.167416.peg.7
MHELQQALANLNTVLRRLDRNPAQYLLGGENIEETKP
>fig|6666666.167416.peg.4
MLLAAGRNRSARAAAREATGQDRCGGKRTGRKSGFPE
>fig|6666666.167416.peg.8
MLGHGRLQGSVRWRGAARKAVGGFLPSGLRPHHEISE
>fig|6666666.167416.peg.5
MDVRAPRAAPTGSDWRCRGRFSFFPRRSSFRSGARRL
>fig|6666666.167416.peg.6
MQIMVIEAMKEGADPEALLSSAQKVIDERTKELDKLD
The expected result should look like this:
>fig|6666666.167416.peg.1
MYVAGHEGIELQPLSAADDAEARRLADEYFSRIDPAGR
>fig|6666666.167416.peg.2
MTHNQCLDLLESAEDTLDFLKSSLTYLGSPQKTENKAR
>fig|6666666.167416.peg.3
MIAHASTPYSRKRGEPGPPHGPGLRRNEPHSGSTPFSL
>fig|6666666.167416.peg.4
MLLAAGRNRSARAAAREATGQDRCGGKRTGRKSGFPE
>fig|6666666.167416.peg.5
MDVRAPRAAPTGSDWRCRGRFSFFPRRSSFRSGARRL
>fig|6666666.167416.peg.6
MQIMVIEAMKEGADPEALLSSAQKVIDERTKELDKLD
>fig|6666666.167416.peg.7
MHELQQALANLNTVLRRLDRNPAQYLLGGENIEETKP
>fig|6666666.167416.peg.8
MLGHGRLQGSVRWRGAARKAVGGFLPSGLRPHHEISE
Thanks, Pierre. I got the linearised FASTA with the help of linearizefasta.awk. I am sorry to say I am new to awk. Please let me know what I should do next to get the expected result shown in my question.
I also can't understand this awk command. Where should I mention my input file in this command? Thanks again, looking forward to your reply.
You give your input file in the command at the provided link, where it appears as input.fa. After replacing input.fa with your file name, add Pierre's answer (now you understand the 3 dots --> linearizing fasta | answer below).
Dear Venu,
Thanks for your idea, but it's causing a problem with my data. It's not working properly. Could you suggest something else?
it's not working properly
- What is it showing? Any error on the screen?
Yes,
it's running, but the output.fa file is empty.
But it's working properly for me. You are using Linux, right?
Yes, I am using Ubuntu 14.04.
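If the awk pipeline keeps giving an empty file, the sort asked for in the question can also be sketched in Python. This is only a minimal sketch, not the approach from Pierre's answer, and it rests on some assumptions: every header contains ".peg." followed by a number, headers without one are placed last, and the function names (read_fasta, peg_number, sort_fasta_by_peg) and file names are made up for illustration. It joins multi-line sequences into one line first (the "linearizing" step) and then sorts numerically, so peg.10 comes after peg.9 rather than after peg.1.

```python
import re
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def peg_number(header):
    """Return the integer after '.peg.' in a header line."""
    match = re.search(r"\.peg\.(\d+)", header)
    # Assumption: headers without a peg number are sorted to the end.
    return int(match.group(1)) if match else float("inf")


def sort_fasta_by_peg(lines):
    """Return all records as a list sorted numerically by peg number."""
    return sorted(read_fasta(lines), key=lambda rec: peg_number(rec[0]))


if __name__ == "__main__" and len(sys.argv) > 1:
    # The input file name is taken from the command line.
    with open(sys.argv[1]) as handle:
        for header, seq in sort_fasta_by_peg(handle):
            print(header)
            print(seq)
```

Saved as, say, sort_peg.py (a hypothetical name), it would be run as `python sort_peg.py input.fa > sorted.fa`, with input.fa replaced by your own file name. It reads the whole file into memory, which should be fine for thousands of sequences.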