Hi all,
Is there a script or tool that can parse NCBI BLAST XML output (produced with the -m 7 option)?
I want a tab-delimited file containing the following information (example values shown after each field):
1. Name of the query sequence: Seq1
2. Length of the query sequence: 30
3. Name of target sequence: gnl|BL_ORD_ID|0
4. Length of target sequence: 5528445
5. Alignment bit score: 59.96
6. E-value: 8.38112e-11
7. Start of alignment within query: 1
8. End of alignment within query: 30
9. Start of alignment within target: 5436010
10. End of alignment within target: 5436039
11. Query frame: 1
12. Target frame: 1
13. Number of identical bases within the alignment: 29
14. Alignment length: 30
15. Aligned portion (sequence) of query: CGGACAGCGCCGCCACCAACAAAGCCACCA
16. Aligned portion (sequence) of target: CGGACAGCGCCGCCACCAACAAAGCCATCA
17. Midline indicating positions of matches within the alignment: ||||||||||||||||||||||||||| ||
Thanks.
Elzed
And the minor ones, too! http://hackage.haskell.org/packages/archive/bio/0.5.0.1/doc/html/Bio-Alignment-BlastXML.html
Thanks, Neilfws. I already have the XML files, which other software requires for annotation. They contain millions of sequences, so I do not want to wait weeks to redo the BLAST runs with -m 8/9.
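
For what it is worth, the 17 columns requested in the question map fairly directly onto elements of the BLAST XML format, so the existing files can probably be converted without rerunning BLAST. Below is a minimal sketch in Python using only the standard library. The function names blast_xml_to_rows and write_tsv are made up for this example. It assumes each Iteration element carries Iteration_query-def and Iteration_query-len; some older blastall releases may record the query only at the top level of the file, in which case those two columns would come out empty and the lookup would need adjusting. It streams the file with iterparse, which should help with inputs containing millions of queries, but it has not been tested on real data of that size.

```python
import sys
import xml.etree.ElementTree as ET

# HSP-level elements, in the column order requested in the question (columns 5-17).
HSP_FIELDS = [
    "Hsp_bit-score", "Hsp_evalue",
    "Hsp_query-from", "Hsp_query-to",
    "Hsp_hit-from", "Hsp_hit-to",
    "Hsp_query-frame", "Hsp_hit-frame",
    "Hsp_identity", "Hsp_align-len",
    "Hsp_qseq", "Hsp_hseq", "Hsp_midline",
]


def blast_xml_to_rows(source):
    """Yield one list of 17 strings per HSP found in a BLAST XML file.

    `source` may be a file name or a binary file object.
    Missing elements are reported as empty strings.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        # Assumption: query name and length are stored per iteration.
        query_name = elem.findtext("Iteration_query-def", default="")
        query_len = elem.findtext("Iteration_query-len", default="")
        for hit in elem.iterfind("Iteration_hits/Hit"):
            hit_id = hit.findtext("Hit_id", default="")
            hit_len = hit.findtext("Hit_len", default="")
            for hsp in hit.iterfind("Hit_hsps/Hsp"):
                hsp_values = [hsp.findtext(tag, default="") for tag in HSP_FIELDS]
                yield [query_name, query_len, hit_id, hit_len] + hsp_values
        # Discard the finished iteration's children to limit memory use.
        elem.clear()


def write_tsv(source, out=sys.stdout):
    """Write one tab-delimited line per HSP to `out`."""
    for row in blast_xml_to_rows(source):
        out.write("\t".join(row) + "\n")
```

Calling write_tsv("results.xml") would print the table to standard output, where results.xml stands in for one of your own files. Queries with no hits produce no lines in this sketch. If you prefer an existing library, Biopython's NCBIXML parser exposes the same information.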