Hi, I am stuck with my BLAST result (outfmt 6). The first column holds the contig ID and the second column holds the subject ID, but I also need the protein name and organism for each subject ID.
e.g. BLAST output (tabular format):
TRINITY_DN2892_c0_g1_i1 gi|743827192|ref|XP_010933525.1| 94.958 238 12 0 742 29 34 271 2.15e-148 433
TRINITY_DN2892_c0_g1_i2 gi|743827192|ref|XP_010933525.1| 95.062 243 12 0 730 2 34 276 7.07e-153 444
TRINITY_DN2873_c0_g1_i1 gi|743848015|ref|XP_010939248.1| 91.447 456 39 0 2 1369 5 460 0.0 850
TRINITY_DN2838_c0_g1_i1 gi|743771805|ref|XP_010916192.1| 90.344 901 85 2 170 2872 2 900 0.0 1504
TRINITY_DN2838_c0_g1_i1 gi|743771805|ref|XP_010916192.1| 91.176 238 21 0 2904 3617 898 1135 2.14e-137 456
etc...
**My expected output (tabular format):**
TRINITY_DN2892_c0_g1_i1 gi|743827192|ref|XP_010933525.1| PREDICTED: aspartic proteinase nepenthesin-2-like isoform X2 [Elaeis guineensis] 94.958 238 12 0 742 29 34 271 2.15e-148 433
TRINITY_DN2892_c0_g1_i2 gi|743827192|ref|XP_010933525.1| PREDICTED: aspartic proteinase nepenthesin-2-like isoform X2 [Elaeis guineensis] 95.062 243 12 0 730 2 34 276 7.07e-153 444
TRINITY_DN2873_c0_g1_i1 gi|743848015|ref|XP_010939248.1| PREDICTED: oligopeptide transporter 3-like [Elaeis guineensis] 91.447 456 39 0 2 1369 5 460 0.0 850
TRINITY_DN2838_c0_g1_i1 gi|743771805|ref|XP_010916192.1| PREDICTED: uncharacterized protein LOC105041089 isoform X1 [Elaeis guineensis] 90.344 901 85 2 170 2872 2 900 0.0 1504
TRINITY_DN2838_c0_g1_i1 gi|743771805|ref|XP_010916192.1| PREDICTED: uncharacterized protein LOC105041089 isoform X1 [Elaeis guineensis] 91.176 238 21 0 2904 3617 898 1135 2.14e-137 456
etc...
I have around 100,000 (one lakh) sequences, so re-running BLAST and editing the output by hand is not practical. Instead, I took all the subject IDs, grepped them against my standalone database, and saved the resulting list of names in a separate text file.
e.g. text file:
gi|743771814|ref|XP_010916197.1| PREDICTED: uncharacterized protein LOC105041093 [Elaeis guineensis]
gi|743771805|ref|XP_010916192.1| PREDICTED: uncharacterized protein LOC105041089 isoform X1 [Elaeis guineensis]
gi|743771786|ref|XP_010916183.1| PREDICTED: uncharacterized protein LOC105041085 isoform X2 [Elaeis guineensis]
Now I want to merge this text file into my BLAST result so that it matches my expected output (look up each subject ID from the second column and add its description next to it). Is there any script or awk command for doing this job?
Thank you.
Did you add "stitle" to your BLAST command?
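If re-running BLAST is not an option, the lookup described in the question can probably be done with a small script. What follows is only a minimal sketch, not a tested pipeline. It assumes the BLAST file is tab-delimited (the outfmt 6 default) and that the names file has one record per line, with the subject ID first and the description after the first whitespace. The function names `load_titles` and `annotate` and the `NA` placeholder for missing IDs are my own choices, not part of BLAST.

```python
import sys


def load_titles(lines):
    """Map each subject ID to its description (the text after the first whitespace)."""
    titles = {}
    for line in lines:
        # Drop surrounding whitespace and a leading '>' left over from FASTA headers.
        line = line.strip().lstrip(">")
        if not line:
            continue
        parts = line.split(None, 1)
        titles[parts[0]] = parts[1] if len(parts) > 1 else ""
    return titles


def annotate(blast_lines, titles, missing="NA"):
    """Yield each BLAST row with the description inserted as a new third column."""
    for line in blast_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        cols = line.split("\t")
        if len(cols) < 2:
            # Not a tabular hit line; pass it through unchanged.
            yield line
            continue
        cols.insert(2, titles.get(cols[1], missing))
        yield "\t".join(cols)


if __name__ == "__main__":
    # Hypothetical usage: script.py <names file> <blast outfmt 6 file>
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as names_fh:
            titles = load_titles(names_fh)
        with open(sys.argv[2]) as blast_fh:
            for row in annotate(blast_fh, titles):
                print(row)
```

Saved under a name of your choice (say `add_titles.py`), it would be run as `python add_titles.py names.txt blast.tsv > annotated.tsv`, where both input file names are placeholders. The names file is loaded into a dictionary first, so each of the 100,000 or so BLAST rows needs only a single lookup instead of a fresh grep.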