Hi,
When running BLAST+ with the default output (-outfmt 0), is there a way to avoid saving the queries that have no hits?
I want to remove the blocks around "No hits found" from the output. I think it would not be hard to do this with Python on the BLAST+ output, but if the program itself has a native option for it, that would save a considerable amount of time, processing, and space.
Just to be crystal clear, this is the part I want to remove from the output:
Query= FCC1WLUACXX:8:1306:8508:25826#GAACGTGA/1
Length=100
***** No hits found *****
Lambda K H
1.33 0.621 1.12
Gapped
Lambda K H
1.28 0.460 0.850
Effective search space used: 8650833807
Thank you all in advance.
Do you need the standard BLAST output? If not, choose one of the other output formats (e.g. tabular), where only hits are reported. Otherwise, you'll need a parser, which exists in (Bio)Python and many other languages.
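To illustrate the tabular route: with -outfmt 6, queries without hits produce no rows at all, so nothing needs to be filtered afterwards. The following is only a minimal sketch of reading such a file in plain Python. It assumes the default 12 columns of -outfmt 6; the function name read_tabular is made up for this example.

```python
import csv
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def read_tabular(lines):
    """Group tabular BLAST hits by query ID (hypothetical helper).

    `lines` is any iterable of text lines, e.g. an open file handle.
    Values are kept as strings; convert them as needed.
    """
    hits = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and the comment lines written by -outfmt 7.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(FIELDS, row))
        hits[hit["qseqid"]].append(hit)
    return hits
```

Usage would look like `with open("hits.tsv") as fh: hits = read_tabular(fh)`, where hits.tsv is a placeholder file name. Only queries with at least one hit appear as keys.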
I don't know whether you can fix that through parameter settings, but you can probably pipe the BLAST output through a custom utility that cuts out the empty records (you can iterate over hits in BioPerl) and then compress the stream to save some space.
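The pipe-and-filter idea could be sketched in Python roughly as follows. This is not a tested production tool, just an illustration under two assumptions taken from the example in the question: every query record starts with a line beginning with "Query= " and ends with the line beginning with "Effective search space used:". The function names are invented for this sketch.

```python
import gzip

NO_HITS = "***** No hits found *****"

def filter_no_hits(lines):
    """Yield the input lines, skipping every query record without hits.

    Assumes a record runs from a "Query= " line to the following
    "Effective search space used:" line; all other lines pass through.
    """
    record = []        # lines of the record currently being buffered
    in_record = False

    def flush():
        # Return the buffered record only if it contains hits.
        keep = not any(NO_HITS in line for line in record)
        kept = list(record) if keep else []
        record.clear()
        return kept

    for line in lines:
        if line.startswith("Query= "):
            if in_record:              # previous record had no end marker
                yield from flush()
            in_record = True
        if in_record:
            record.append(line)
            if line.startswith("Effective search space used:"):
                yield from flush()
                in_record = False
        else:
            yield line
    if in_record:                      # unterminated record at end of input
        yield from flush()

def filter_to_gzip(lines, path):
    """Write the filtered stream to a gzip-compressed text file."""
    with gzip.open(path, "wt") as out:
        out.writelines(filter_no_hits(lines))
```

In a pipeline this could be called as `filter_to_gzip(sys.stdin, "kept.txt.gz")` (after `import sys`, with a placeholder output name), so that the unfiltered report never has to be written to disk.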
That was the solution I ended up using: I reduced my output from 2.8 GB to 1.8 GB with a Python script. Sadly, it took a bit of extra time to process, something like 620 minutes vs. 640 minutes.
If you have a huge dataset, you can produce both outputs, the default one and a tabular one: use the tabular output for analysis and the standard output to inspect alignments, etc.
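One way to get both outputs without searching twice is to save the results once as a BLAST archive (-outfmt 11) and then convert that archive with blast_formatter, which ships with BLAST+. The sketch below only builds and runs the command lines. It assumes blastn as the search program, and the file names and helper names are placeholders.

```python
import subprocess

def archive_commands(query, db, prefix):
    """Build the search and the two reformatting commands (hypothetical helper)."""
    archive = f"{prefix}.asn"
    # Run the search once and store the results as a BLAST archive.
    search = ["blastn", "-query", query, "-db", db,
              "-outfmt", "11", "-out", archive]
    # Convert the archive to the standard pairwise report ...
    pairwise = ["blast_formatter", "-archive", archive,
                "-outfmt", "0", "-out", f"{prefix}.txt"]
    # ... and to the tabular format, which lists hits only.
    tabular = ["blast_formatter", "-archive", archive,
               "-outfmt", "6", "-out", f"{prefix}.tsv"]
    return [search, pairwise, tabular]

def run_all(query, db, prefix):
    """Run the three commands in order, stopping at the first failure."""
    for cmd in archive_commands(query, db, prefix):
        subprocess.run(cmd, check=True)
```

For example, `run_all("reads.fasta", "my_db", "results")` would leave results.asn, results.txt and results.tsv behind, given that BLAST+ is installed and on the PATH.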
Has it become possible to filter the standard BLAST output since then?