I am running blastn on some nucleotide data, and it seems to run indefinitely when I generate XML output. The jobs take ~15 minutes when generating either the default format or tab-delimited output, but when I choose XML format each job maxes out the 3-hour cap I have set on it. I find it hard to believe XML generation would increase the job length at least twelvefold, so I figure there is a problem somewhere. Has anyone run into this?
I am using BLAST 2.2.24+.
Thanks
EDIT: Here are some example commands:
Working (archive format):
/uaopt/ncbi/2.2.24+/bin/blastn -outfmt 11 -db beij \
-query $home'datasets/main/KCmeta.fna' \
-out '/scr3/bmf/results/referencealignment/blast/beij_archive.asn'
Not working (xml format):
/uaopt/ncbi/2.2.24+/bin/blastn -outfmt 5 -db beij \
-query $home'datasets/main/KCmeta.fna' \
-out '/scr3/bmf/results/referencealignment/blast/beij.xml'
Also working are the default format, tab delimited, tab delimited w/ comments, & CSV.
I realized I have access to 2.2.24+, but the results are the same. I'd prefer not to need 2.2.25, since this is a high-performance computing lab where I have to request that it be installed.
Please post one exact command that works and one that does not.
Just a thought: you're not running out of disk space?
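One rough way to rule this out, as a sketch: check the free space on the filesystem that holds the output directory, and watch how large the XML file gets while the job runs. The helper names `free_space` and `file_size_kb` are made up for illustration; the paths in the usage comments are the ones from the question.

```shell
# Report free space on the filesystem holding the given directory.
# -P requests the portable POSIX layout, -k reports sizes in 1 KiB blocks.
free_space() {
    df -Pk "$1"
}

# Report the current size (in KiB) of an output file that may still be growing.
file_size_kb() {
    du -k "$1" | cut -f1
}

# Usage (paths from the question; adjust as needed):
# free_space /scr3/bmf/results/referencealignment/blast
# file_size_kb /scr3/bmf/results/referencealignment/blast/beij.xml
```

If the XML file stops growing long before the 3-hour cap while the job keeps running, or the filesystem shows no space available, that points to a storage problem rather than to BLAST itself.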
Using the XML format usually generates a lot of data. How many sequences are you running, and against which database? As Michael asked, please show us the parameters you've used.
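To answer the "how many sequences" part, a minimal sketch: count the FASTA header lines in the query file. This assumes the query is a plain, uncompressed FASTA file in which every record starts with `>`; the function name `count_fasta_records` is just illustrative.

```shell
# Count FASTA records by counting header lines (lines starting with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}

# Usage (query path from the question; adjust as needed):
# count_fasta_records "$home"'datasets/main/KCmeta.fna'
```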
Please try again with the latest version 2.2.25.