The user guide says:
But I found another explanation saying that -Z should be set to the size of the genome times 2.
I am confused now. How should I set -Z? I would also like to know what the E-value is.
To calculate Z, please visit this link and use the following command:
esl-seqstat my_reference.fasta
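As a rough sketch of how the residue count and the "times 2" advice from the question might fit together: esl-seqstat reports the total number of residues in the file, and the code below turns such a count into a value for -Z. It assumes the search tool expects -Z in megabases and searches both strands; both are assumptions here, so check the user guide of the tool you are running. The function names are made up for illustration.

```python
def total_residues(fasta_path):
    """Count sequence characters in a FASTA file, skipping header lines."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith(">"):
                total += len(line)
    return total


def z_in_megabases(n_residues, both_strands=True):
    """Convert a residue count into a search space size in Mb.

    Assumption: the search tool wants -Z in megabases and, for a
    two-strand search, counts each nucleotide twice.
    """
    strands = 2 if both_strands else 1
    return n_residues * strands / 1_000_000
```

For example, z_in_megabases(total_residues("my_reference.fasta")) would give the number to pass to -Z under those assumptions.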
E-value: The E-value is the statistical significance of the hit: the number of hits we’d expect to score this highly in a database of this size (measured by the total number of nucleotides) if the database contained only nonhomologous random sequences. The lower the E-value, the more significant the hit.
-Z: This option sets the database size used in the E-value calculation, so that the reported E-values are accurate for the database you actually searched.
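To illustrate why the database size matters, here is a minimal sketch that assumes the E-value grows in proportion to the search space size. The function name and the numbers are hypothetical and not taken from any tool's output.

```python
def rescale_evalue(evalue, z_used, z_wanted):
    """Rescale an E-value computed for search space z_used to z_wanted.

    Assumption: the E-value is proportional to the search space size,
    so the same hit becomes less significant in a larger database.
    """
    if z_used <= 0 or z_wanted <= 0:
        raise ValueError("search space sizes must be positive")
    return evalue * z_wanted / z_used
```

Under this assumption, a hit reported with an E-value of 0.5 against a search space of 10 would have an E-value of 1.0 against a search space of 20, which is why a wrong -Z makes hits look more or less significant than they are.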
I hope this helps.
I believe the question about -Z has been answered already. As for the E-value, it is basically the number of matches you would expect to get purely by chance that are of similar quality to the current match (or better). So if you have an E-value of 1, you could expect to get up to one extra match (purely by chance) whose alignment to the query is as good as the match you're looking at currently.