I don't think there is an option to blastall/rpsblast that will fix this issue. This usage guide states, for the -b flag, that "This is not the number of alignment segments or HSPs, since a given domain may have more than one portion aligned to the query."
You could get a list with one line per hit, ignoring the composite HSPs, by parsing the BLAST output. Using the SearchIO library from BioPerl, something like this should work:
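    #!/usr/bin/perl -w
    use strict;
    use Bio::SearchIO;

    # Open the saved BLAST report (standard text format).
    my $searchio = Bio::SearchIO->new(-file => "myblastfile", -format => "blast");

    # One result per query sequence in the report.
    while (my $result = $searchio->next_result) {
        # One hit per matching subject; the individual HSPs are not iterated,
        # so each hit is printed once regardless of how many HSPs it has.
        while (my $hit = $result->next_hit) {
            my @output = ($result->query_name, $hit->name, $hit->raw_score,
                          $hit->bits, $hit->significance);
            print join("\t", @output), "\n";
        }
    }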
This is just an example with some selected BLAST statistics (raw score, bit score etc.); see the documentation for how to access other parts of the BLAST report.
It's not really a bug. If there is no "best" HSP (since they're identical) then in effect, they are all the "top hit". If the raw output isn't what you want, the solution is to parse it.
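For anyone who would rather not install BioPerl, here is a minimal sketch of the same idea in plain Perl. It assumes the report was saved in the 12-column tabular format (-m 8 in legacy blastall), with query in column 1, subject in column 2 and bit score in column 12. The function name best_hsp_per_hit and the sample rows are made up for illustration; they are not part of BLAST or BioPerl. It keeps only the highest-scoring HSP for each query/subject pair, and when two HSPs tie, the first one seen is kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep one row per query/subject pair: the HSP with the highest bit score.
# Takes tab-separated lines in the assumed 12-column tabular layout and
# returns the kept lines in the order the pairs were first seen.
sub best_hsp_per_hit {
    my @lines = @_;
    my (%best, @order);
    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;      # skip comment and blank lines
        my @f = split /\t/, $line;
        next if @f < 12;                     # skip rows that are too short
        my $key = join "\t", @f[0, 1];       # query + subject
        (my $bits = $f[11]) =~ s/^\s+|\s+$//g;
        if (!exists $best{$key}) {
            push @order, $key;
            $best{$key} = { bits => $bits, line => $line };
        }
        elsif ($bits > $best{$key}{bits}) {  # strictly better: ties keep the first
            $best{$key} = { bits => $bits, line => $line };
        }
    }
    return map { $best{$_}{line} } @order;
}

# Hypothetical sample rows (not real BLAST output), two of them redundant.
my @sample = map { join "\t", @$_ } (
    [qw(q1 dom1 90.0 100 10 0  1 100  1 100 1e-20 80.5)],
    [qw(q1 dom1 85.0  50  7 0 20  70 20  70 1e-08 42.0)],
    [qw(q1 dom2 70.0  90 27 0  5  95  3  93 1e-12 60.1)],
    [qw(q2 dom1 60.0  40 16 0  1  40  1  40 1e-04 33.3)],
    [qw(q2 dom1 88.0  80  9 0 10  90 10  90 1e-10 55.0)],
);

print "$_\n" for best_hsp_per_hit(@sample);
```

With the sample rows this prints three lines, one for each of q1/dom1, q1/dom2 and q2/dom1. To run it on a real file, replace @sample with the lines read from that file, e.g. best_hsp_per_hit(<>) with the filename given on the command line.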
Thanks for the advice. The same output can be retrieved using the blastall option -m 7, which generates output in XML format that can be viewed in MS Excel. However, the redundancy still remains. I think this is a bug in BLAST and should be improved.
I agree with peirre and neilfws that parsing is the only option to remove composite HSPs. Thanks for all your suggestions.