Hello, I have a BLAST output listing about 50,000 sequences. I need to sort the results by e-value, from smallest to largest. Could you suggest a way to do this? Thanks! The BLAST output looks like this:
Tx_332323_c0_seq1 110815944 RecName: Full=Gag-Pol polyprotein; AltName: Full=Pr160Gag-Pol; Contains: RecName: Full=Matrix protein p17; Short=MA; Contains: RecName: Full=Capsid protein p24; Short=CA; Contains: RecName: Full=Spacer peptide p2; Contains: RecName: Full=Nucleocapsid protein p7; Short=NC; Contains: RecName: Full=Transframe peptide; Short=TF; Contains: RecName: Full=p6-pol; Short=p6*; Contains: RecName: Full=Protease; AltName: Full=PR; AltName: Full=Retropepsin; Contains: RecName: Full=Reverse transcriptase/ribonuclease H; AltName: Full=Exoribonuclease H; AltName: Full=p66 RT; Contains: RecName: Full=p51 RT; Contains: RecName: Full=p15; Contains: RecName: Full=Integrase; Short=IN 1446 5 457 107 1028 1138 4e-06 66 35.00 -3/0
Tv_333332_c0_seq2 206558251 RecName: Full=Zinc finger MYM-type protein 5 656 8 41 796 405 615 6e-14 92 27.63 2/0
Tv_144391_c0_seq1 75057844 RecName: Full=Nitrogen permease regulator 2-like protein; Short=NPR2-like protein; AltName: Full=Tumor suppressor candidate 4 380 4 1410 304 7 380 6e-118 65 48.27 -3/0
Ty_116400_c0_seq1 73920872 RecName: Full=Longitudinals lacking protein, isoforms F/I/K/T 970 1 29 190 902 956 4e-09 33 45.45 2/0
Ty_144400_c0_seq2 73920872 RecName: Full=Longitudinals lacking protein, isoforms F/I/K/T 970 1 29 190 902 956 2e-08 20 45.45 2/0
Tx_444402_c0_seq1 74834619 RecName: Full=Cathepsin L-like proteinase; Flags: Precursor 324 10 57 1034 1 323 9e-75 91 42.09 3/0
You know, there is a Linux tool named 'sort'. It sorts.
Thanks, but I tried that and it gives faulty results.
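Try it this way: sort -k 10,10g input -o output
The g modifier makes sort compare the 10th field as a general number, so e-values such as 6e-118 are ordered numerically rather than as text. If your columns are tab-separated, also pass -t with a tab character (in bash: -t $'\t'), because by default sort splits fields on any whitespace and the spaces inside the description column would shift the field count.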
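If sort still misbehaves on your file, the same idea can be sketched in Python. This is only a rough example under assumptions that are not confirmed in the thread: that the file is tab-delimited and that the e-value sits in column 10, as in the sample rows. The function names and the file names in the usage note are made up for illustration.

```python
def evalue_key(line, column=10, sep="\t"):
    """Return the e-value of one BLAST row as a float (column is 1-based)."""
    fields = line.rstrip("\n").split(sep)
    return float(fields[column - 1])


def sort_blast_lines(lines, column=10, sep="\t"):
    """Return the non-empty rows ordered from smallest to largest e-value."""
    rows = [line for line in lines if line.strip()]
    return sorted(rows, key=lambda line: evalue_key(line, column, sep))


def sort_blast_file(in_path, out_path, column=10, sep="\t"):
    """Read a BLAST table, sort it by e-value, and write the result."""
    with open(in_path) as src:
        ordered = sort_blast_lines(src, column, sep)
    with open(out_path, "w") as dst:
        for line in ordered:
            # Make sure every row ends with exactly one newline.
            dst.write(line.rstrip("\n") + "\n")
```

You would call it as sort_blast_file("input", "output"). Converting each e-value with float() is what gives the numeric order, the same job the g modifier does for sort.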
Also, there is always the manual way...