Hi, I was wondering if anyone could tell me whether there's a way to get the tabular BLAST output to show when a query has no hit. I don't mind using either blastall or BLAST+.
The reason I ask is that I'm trying to track big BLAST jobs, so big that the BLAST XML output gets too large to store and too slow to parse with Biopython. BLAST XML does show when there's no hit, though.
Thanks
I think I may not have explained this correctly - I would like to be able to do this while the BLAST job is still running.
With comm, the BLAST run would have to be complete.
But comm can read from stdin (use '-' as one of the filenames), and you can put tee in the pipeline to save your original BLAST output.
The point might be that, while the BLAST process is still running, this approach cannot tell whether a query sequence is missing from the output because it had no hit or because it hasn't been processed yet.
@Michael, but comm will only print a result after EOF, or once the current query name is greater than the expected one (it's the same behavior as uniq).
@Pierre - Can you explain this again? I don't quite understand.
Thanks - makes sense now.
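For the "no hit" versus "not processed yet" question raised above, here is a rough sketch of one way to classify queries from a snapshot of a running job. It rests on an assumption that is not confirmed in this thread: that BLAST writes its tabular results in the same order as the queries appear in the input. If that holds, any query listed before the last query seen in the output, but absent from it, had no hit. The function name classify_ids and all file names and IDs are made up for illustration.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: BLAST writes tabular results in input query order.
# classify_ids, the file names and the sequence IDs are all hypothetical.

# Usage: classify_ids IDS_IN_INPUT_ORDER PARTIAL_BLAST_OUTPUT
# Prints "<id><TAB><status>" with status hit, no_hit or pending.
classify_ids() {
    awk -F '\t' '
        $1 == ""  { next }                            # skip blank lines
        pass == 1 { seen[$1] = 1; last = $1; next }   # partial BLAST output
        { order[++n] = $1; if ($1 == last) last_pos = n }   # query ID list
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                if (id in seen)        status = "hit"
                else if (i < last_pos) status = "no_hit"   # BLAST already moved past it
                else                   status = "pending"  # not reached yet
                print id "\t" status
            }
        }
    ' pass=1 "$2" pass=2 "$1"
}

workdir=$(mktemp -d)
# Query IDs in the same order as the FASTA input (not sorted).
printf 'seqD\nseqA\nseqC\nseqB\n' > "$workdir/ids_in_order.txt"
# Mock snapshot of tab-separated output from a job that is still running.
printf 'seqD\tsubj1\t99.0\nseqC\tsubj2\t90.0\n' > "$workdir/partial_hits.tsv"

classify_ids "$workdir/ids_in_order.txt" "$workdir/partial_hits.tsv"
```

With the mock data, seqD and seqC are reported as hit, seqA as no_hit (it precedes seqC in the input but never appears in the output), and seqB as pending. BLAST may buffer its output, so a snapshot can lag behind what has actually been processed.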
This is a brilliant solution. One caveat, though: I would use "sort -u" instead of uniq in the last command, because the input.fasta file might not be in sorted order, in which case the BLAST output won't be sorted either (ids.txt has to be sorted as well, since comm expects both inputs sorted). So: (your blastall command) | tee redirect.result.txt | cut -f 1 | sort -u | comm -31 - ids.txt
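To try the pipeline above without a real BLAST run, here is a small self-contained sketch. The helper name list_no_hit_ids, the mock hits file, and the sequence IDs are invented for the demo, and cat stands in for the actual blastall or BLAST+ command.

```shell
#!/bin/sh
# Sketch only: file names and sequence IDs below are made up for illustration.
export LC_ALL=C   # make sort and comm agree on collation order

# Usage: <blast command> | list_no_hit_ids SORTED_IDS_FILE SAVED_OUTPUT_FILE
# Saves the tabular BLAST output read from stdin, then prints every ID from
# the sorted ID list that never appears in column 1 of that output.
list_no_hit_ids() {
    tee "$2" | cut -f 1 | sort -u | comm -31 - "$1"
}

workdir=$(mktemp -d)
printf 'seqA\nseqB\nseqC\nseqD\n' | sort > "$workdir/ids.txt"
# Mock tab-separated hits, deliberately out of order and with a repeated query.
printf 'seqC\tsubj1\t98.5\nseqA\tsubj2\t91.0\nseqC\tsubj3\t88.2\n' > "$workdir/mock_hits.tsv"

# cat stands in for the real BLAST command.
cat "$workdir/mock_hits.tsv" |
    list_no_hit_ids "$workdir/ids.txt" "$workdir/redirect.result.txt"
```

With the mock data this prints seqB and seqD, the two IDs with no rows in the output, while redirect.result.txt keeps the untouched BLAST output. Because sort has to read all of its input before writing anything, the list only appears once the BLAST command has finished.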