Parsing Blast Results In Java
11.2 years ago
weslfield ▴ 90

Hey guys, I am building a Java app with a GUI for metagenomic analysis. My question is about running BLAST searches and getting at the results as efficiently as possible. I have been using BioJava3 for other functionality in my software, but BLAST handling is a bit of an issue, especially with the most recent release (BioJava3 doesn't have a fully developed BLAST parser yet). For my purposes I only need the IDs of the hits, so that I can extract phylogenetic information from them; I don't need the hit sequences themselves. Can someone point me to the best way to do this? I have done it very easily with Perl/BioPerl, but am struggling in Java. Thanks for the help and suggestions!

biojava parsing xml blast • 4.1k views
11.2 years ago

Generating a Java parser for BLAST+ XML? Easy:

${JAVA_HOME}/bin/xjc -p blast -dtd "http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd"

and you're done.
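If only the hit IDs are needed, a streaming reader over the same XML is a possible alternative to generating classes. The sketch below uses the JDK's StAX API rather than the xjc/JAXB route above, and collects the text of every Hit_id element defined in NCBI_BlastOutput.dtd. The class and method names (BlastXmlHitIds, hitIds) are made up for illustration, and the report path is assumed to be passed as the first argument.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: the class and method names are illustrative only.
public class BlastXmlHitIds {

    /** Returns the text of every Hit_id element, in document order. */
    public static List<String> hitIds(Reader xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser not to process the DTD or resolve external entities,
        // so reading a report does not depend on fetching the NCBI DTD.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        List<String> ids = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // Only element starts named Hit_id are of interest.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Hit_id".equals(reader.getLocalName())) {
                    ids.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return ids;
    }

    public static void main(String[] args) throws IOException, XMLStreamException {
        // args[0]: path to a BLAST XML report (assumed to be supplied by the caller)
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
            for (String id : hitIds(in)) {
                System.out.println(id);
            }
        }
    }
}
```

This keeps memory use flat on large reports, at the cost of not giving typed access to the rest of the record the way generated classes would.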

See also:

BioJava and Blast+

How to perform a Blast Search from a Java Application?

11.2 years ago
Neilfws 49k

The simplest solution would be to run BLAST with options which return simplified, easy-to-parse output (such as tab-delimited).

Assuming that you are running the newer BLAST+ programs, the option is:

-outfmt <string>

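where the string starts with a format number (0 to 11 in the BLAST+ release current when this was written). Values of 6, 7 and 10 give tab-delimited, tab-delimited with comment lines, and CSV output, respectively. You can further customize delimited output by listing specifiers after the number, e.g. 'sseqid' for the subject sequence ID, as in -outfmt "6 qseqid sseqid".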
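On the Java side, reading that kind of output needs little more than splitting lines on tabs. Here is a minimal sketch, assuming a report written with -outfmt 6 or 7 in the default column order (where sseqid is the second column); the class and method names (BlastTabularIds, subjectIds) are hypothetical, and the report path is assumed to be passed as the first argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: the class and method names are illustrative only.
public class BlastTabularIds {

    /**
     * Reads tab-delimited BLAST+ output and returns the distinct subject IDs
     * in order of first appearance.
     *
     * @param sseqidColumn zero-based index of the sseqid column
     *                     (1 in the default -outfmt 6/7 layout)
     */
    public static List<String> subjectIds(BufferedReader in, int sseqidColumn)
            throws IOException {
        Set<String> ids = new LinkedHashSet<>();
        String line;
        while ((line = in.readLine()) != null) {
            // Skip blank lines and the '#' comment lines that -outfmt 7 adds.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] fields = line.split("\t");
            if (fields.length > sseqidColumn) {
                ids.add(fields[sseqidColumn]);
            }
        }
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a tabular BLAST report (assumed to be supplied by the caller)
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]))) {
            for (String id : subjectIds(in, 1)) {
                System.out.println(id);
            }
        }
    }
}
```

If the report was produced with a custom specifier list such as -outfmt "6 sseqid", the ID is the first column, so the index passed in would be 0 instead of 1.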
