Hi all, I'm currently using standalone BLAST, version ncbi-blast-2.2.25+. I tried to generate XML-format BLAST results for my downstream analysis, but I ran into a problem.
The command I use is:
"blastx -query shorttest -db ~/blastdb/nr -evalue 1e-3 -max_target_seqs 20 -num_threads 24 -outfmt 5 > t.xml"
The command keeps running, but nothing is ever written to t.xml. If I use another output format, such as the default or -outfmt 6, it prints the correct results.
Thanks very much!
EDIT:
Yes, it's transcriptome data. I am trying to annotate the whole transcriptome with blast2go, and the first step is to run blastx against the nr database.
I understand that blastx may buffer its output rather than stream it, but I can wait for 24 hours and get nothing, whereas with another option such as "-outfmt 6" I get results within minutes, so buffering is probably not the issue.
I also tried a small query against the nr database, and the same query against a small local database; both give me results. So I'm not sure whether it is the combination of the large query, the large database, and the XML format that causes the problem. Thanks very much!
EDIT2:
Thanks~ My suspicion about the XML format is that it waits until everything has finished before printing the results, since the document has opening and closing tags. My files are so big that the search can't finish within one day, which is the time limit for a single job on our server, and as a result nothing is printed.
Any suggestions will be appreciated.
EDIT3:
Guys, thanks for all the suggestions. In the end I decided to first use "-outfmt 11" to generate a BLAST archive (ASN.1) file, and then use blast_formatter, which is provided by NCBI, to convert the result into XML format. I tested this on a small sample and it works fine; I hope it works for my whole data set.
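For anyone who wants to try the same two-step approach, here is a minimal sketch of it. The archive name t.asn and the DRY_RUN switch are my own assumptions for illustration, not part of the original commands; the query, database and search options are copied from the command in the question. As far as I know, blast_formatter also needs access to the same database when it formats the archive.

```shell
#!/bin/sh
# Two-step workflow: search once into a BLAST archive, then format it as XML.
QUERY=${QUERY:-shorttest}
DB=${DB:-$HOME/blastdb/nr}
ARCHIVE=${ARCHIVE:-t.asn}   # hypothetical file name for the -outfmt 11 archive
XML=${XML:-t.xml}

# With DRY_RUN=1 (the default in this sketch) commands are only printed.
# Set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: run the search and store the result as a BLAST archive (ASN.1).
# Step 2: convert the archive to XML (-outfmt 5) without repeating the search.
run blastx -query "$QUERY" -db "$DB" -evalue 1e-3 -max_target_seqs 20 \
    -num_threads 24 -outfmt 11 -out "$ARCHIVE" &&
run blast_formatter -archive "$ARCHIVE" -outfmt 5 -out "$XML"
```

The same archive can be formatted again with a different -outfmt value (for example 6 for tabular output), so the expensive search only has to be run once.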
Have you tried running it with "-out t.xml" instead of "> t.xml"?
Philipp is right, you should use -out instead of redirection. If you need the detailed command options, please type "blastx -help".
First things first: is your data made up of NGS reads? If so, see my previous comment.
If not, I don't think your BLAST search will finish in 24 hours, so independently of the output format, you should talk to your system administrator about this job time limit.
Finally, concerning XML and the other output formats: I don't know how XML output is written to disk, but I am 100% positive that the other outputs are kept in RAM and then written in batches. BLAST neither streams its output continuously nor writes it all in one go. This means you should double-check that your query is complete in both XML and the other outputs, as having some information in the output file is not synonymous with a completed query.
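One rough way to do that double check is sketched below. The file names t.tsv and t.xml are assumptions, and the helper function names are made up for this example. Note that queries without any hit do not appear in -outfmt 6 output, so the tabular count is only a lower bound; the XML check relies on a finished BLAST XML report ending with the closing root tag </BlastOutput>.

```shell
#!/bin/sh
# Hypothetical file names; adjust them to your own run.
QUERY=${QUERY:-shorttest}
TABULAR=${TABULAR:-t.tsv}
XML=${XML:-t.xml}

# Number of query sequences in the FASTA input (one header line per sequence).
count_queries() {
    grep -c '^>' "$1"
}

# Number of distinct queries with at least one hit in -outfmt 6 output
# (column 1 of the tabular format is the query ID).
count_queries_with_hits() {
    cut -f1 "$1" | sort -u | wc -l | tr -d ' '
}

# Succeeds only if the closing root tag is found near the end of the XML file.
xml_is_closed() {
    tail -n 5 "$1" | grep -q '</BlastOutput>'
}

# Report only on the files that actually exist.
if [ -f "$QUERY" ] && [ -f "$TABULAR" ]; then
    echo "queries in input:  $(count_queries "$QUERY")"
    echo "queries with hits: $(count_queries_with_hits "$TABULAR")"
fi
if [ -f "$XML" ]; then
    if xml_is_closed "$XML"; then
        echo "XML looks complete"
    else
        echo "XML looks truncated"
    fi
fi
```

If the last query ID in the tabular file is far from the end of the input FASTA, or the XML file lacks its closing tag, the job was most likely killed before the search finished.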