At first, I thought this question would be answered by the "qcovs" field, but a glance at the results showed that this isn't the case. To begin with, each qcovs value relates not to the original query but to a smaller query partitioned from it. And I don't even know what this number actually means for those mini-queries. The manual calls it "Query Coverage Per Subject," but apparently it uses the term in a different sense from what I would normally understand.
Second, "length" is supposed to be the "length of alignment," but I'm not sure what that means, either. It's neither the length of the mini-query (qend-qstart+1) nor that of the corresponding subject, although the three are strongly correlated.
My purpose is to see whether the genome assembler succeeded in putting together a conserved gene of interest. As a measure of how well each original (unpartitioned) gene query is assembled, I'm thinking of either:
max([set of "nident" from all mini-queries based on the same original query])/original query length
or
max([set of "length" from all mini-queries based on the same original query])/original query length
Which one, if any, is the right approach? Please feel free to suggest your own, although I would appreciate an explanation of what I got wrong. An explanation of what "qcovs" and "length" actually measure would be nice, too. Thank you.
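To make the two candidates concrete, here is a minimal sketch of how I would compute them. It rests on several assumptions that are mine and not part of BLAST: the results are plain tabular output (for example `-outfmt "6 qseqid sseqid qstart qend length nident"`, with the columns in that order), the mini-query IDs follow a hypothetical `<original>_part<N>` naming scheme, and the original query lengths are supplied separately. The names `original_id` and `candidate_scores` are made up for illustration. The sketch only computes the two ratios; it does not settle which one, if either, is appropriate.

```python
import csv
import io
from collections import defaultdict

# Assumed column order; it must match the -outfmt string actually used.
COLUMNS = ["qseqid", "sseqid", "qstart", "qend", "length", "nident"]


def original_id(mini_query_id):
    """Map a mini-query ID back to its original query.

    Hypothetical naming scheme: "<original>_part<N>".
    """
    return mini_query_id.rsplit("_part", 1)[0]


def candidate_scores(blast_tsv, original_lengths):
    """Return {gene: (max nident / query length, max length / query length)}."""
    best_nident = defaultdict(int)
    best_length = defaultdict(int)
    reader = csv.DictReader(blast_tsv, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        gene = original_id(row["qseqid"])
        # Keep the largest value seen over all mini-queries of the same gene.
        best_nident[gene] = max(best_nident[gene], int(row["nident"]))
        best_length[gene] = max(best_length[gene], int(row["length"]))
    # Genes with no hits at all get 0.0 for both ratios.
    return {
        gene: (best_nident[gene] / qlen, best_length[gene] / qlen)
        for gene, qlen in original_lengths.items()
    }


# Made-up rows, for illustration only.
demo = (
    "geneA_part1\tcontig1\t1\t100\t102\t95\n"
    "geneA_part2\tcontig1\t1\t80\t80\t78\n"
    "geneB_part1\tcontig2\t1\t50\t50\t50\n"
)
scores = candidate_scores(io.StringIO(demo), {"geneA": 200, "geneB": 50})
for gene, (by_nident, by_length) in scores.items():
    print(gene, round(by_nident, 3), round(by_length, 3))
```

In the made-up data, the first row has a "length" of 102 although qend-qstart+1 is 100, which is the kind of mismatch I described above.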
Please check my recent comment in another thread:
C: BLAST definition and difference between 'qcovs' and 'qcovhsp'