Hi,
I have a simple question. Why does the XML output format in blastn take a lot longer than the tabular output?
There is a huge difference in run time between the two output formats, using the same data and database.
Thanks a lot.
Considering the volume of both output types, it could be that your filesystem is slow (for example, a network drive). It could also be that BLAST writes XML out in sizable chunks, i.e. everything between an opening and a closing tag, so it may merely look slower, since the chunks for the tabular format are significantly smaller. In any case, you may want to run some well-designed profiling to see where the time goes.
for f in 5 6; do { time for i in {1..10}; do blastn -query NC_012808.1.fna -subject NC_012808.1.fna -outfmt "$f" > "$f"_file.tmp; done ; } 2> "$f"_time.txt; done
Here, the user+sys time for 10 runs of blastn of this genome against itself was ca. 37 s for XML and 30 s for tabular. I presume the difference is mainly related to I/O: the XML output is 26M, whereas the tabular output is 384K.
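One way to check the I/O guess is to repeat the timing with the output sent to /dev/null instead of a file. If XML is still slower there, the extra time is more likely spent formatting the output than writing it to disk. Below is a rough sketch, not a proper benchmark: elapsed_runs is a hypothetical helper (not part of BLAST), and it reports wall-clock time in whole seconds rather than user+sys, so it only makes sense with enough repeats.

```shell
# elapsed_runs N OUTFILE CMD...
# Hypothetical helper: run CMD N times, sending its stdout to OUTFILE,
# then print the elapsed wall-clock time in whole seconds.
elapsed_runs() {
    n=$1
    out=$2
    shift 2
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        # Stop early if the command fails, so a broken run is not timed.
        "$@" > "$out" || return 1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}

# Usage with the files from the test above (5 = XML, 6 = tabular):
# for f in 5 6; do
#     for sink in "$f"_file.tmp /dev/null; do
#         echo "outfmt $f -> $sink: $(elapsed_runs 10 "$sink" blastn \
#             -query NC_012808.1.fna -subject NC_012808.1.fna -outfmt "$f") s"
#     done
# done
```

If the file and /dev/null timings are close for both formats, the disk is probably not the bottleneck on that machine; on a network drive the gap could look quite different.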
I doubt it takes that much longer. Could you perhaps provide more details?