5.2 years ago
mxlsherry1992
Hi,
Through a series of analyses, I obtained several sequences from species A. I want to confirm whether these genes (sequences) also exist in species B, so I plan to use NCBI BLAST for that. It seems the default E-value in NCBI BLAST is 1e-10, whereas when we run BLAST on our own server the default is 1e-5. To confirm whether these genes are also present in species B, what E-value should I use?
Thank you!
If you want to know whether a gene from A is in B, the E-value is not that important, because it depends on the database size (here, your genome). It is better to set thresholds on similarity and coverage, but you need to estimate those values by comparing members of the gene family: some gene families are highly conserved, others are really variable.
Finally, even with a good hit, you may need to validate the gene by checking for typical domains or other gene characteristics.
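As a rough illustration of filtering on similarity and coverage rather than on E-value alone, here is a minimal Python sketch. It assumes the BLAST hits were saved as tab-separated text with the columns `qseqid sseqid pident qstart qend qlen evalue` in that order (a custom column choice, not the default tabular layout). The function name `filter_hits` and the 70% identity / 80% coverage cutoffs are placeholders made up for the example; real cutoffs should come from looking at the gene family in question. Coverage is computed per alignment, so a gene split across several alignments would be underestimated.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float   # percent identity of the alignment
    qcov: float     # percent of the query covered by this alignment
    evalue: float


def filter_hits(lines: Iterable[str],
                min_pident: float = 70.0,   # placeholder cutoff, not a recommendation
                min_qcov: float = 80.0      # placeholder cutoff, not a recommendation
                ) -> List[Hit]:
    """Keep hits that pass both an identity and a query-coverage threshold.

    Assumed column order (tab-separated):
    qseqid, sseqid, pident, qstart, qend, qlen, evalue
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid, sseqid, pident, qstart, qend, qlen, evalue = line.split("\t")
        # Fraction of the query spanned by this single alignment, in percent.
        qcov = 100.0 * (int(qend) - int(qstart) + 1) / int(qlen)
        hit = Hit(qseqid, sseqid, float(pident), qcov, float(evalue))
        if hit.pident >= min_pident and hit.qcov >= min_qcov:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Two made-up hits: the first spans most of the query, the second only a fragment.
    example = [
        "geneA1\tchrB1\t92.5\t1\t290\t300\t1e-80",
        "geneA2\tchrB2\t95.0\t10\t60\t300\t1e-12",
    ]
    for hit in filter_hits(example):
        print(hit.query, hit.subject, round(hit.qcov, 1))
```

In this toy input the second hit has high identity and a small E-value but covers only a short stretch of the query, which is the kind of case an E-value cutoff alone would not flag.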
You can't set a clear-cut threshold using the p-value; similarity and length of overlap will give you better insight. That's without even delving into the question of how to define two proteins as "similar".
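To make the point about cutoffs concrete, here is a small sketch of why the same alignment gets a different E-value in different searches. It uses the standard relation between bit score and E-value, E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the total database length. The function name `expected_hits` and the numbers are illustrative only, and actual BLAST uses corrected "effective" lengths, so reported values will differ somewhat.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    Uses raw lengths; BLAST itself applies effective-length corrections,
    so this is an approximation for illustration.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Same hypothetical alignment (bit score 40, 300-residue query)
    # searched against a small and a 100x larger database.
    small = expected_hits(40.0, 300, 10**8)
    large = expected_hits(40.0, 300, 10**10)
    print(small, large, large / small)
```

The E-value scales linearly with the search space, so an identical hit looks a hundred times "worse" against a database a hundred times larger, while its identity and overlap length stay the same.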
Hi, yes, the p-value is just one of the parameters, but it needs an exact number. Actually, this step is just to further confirm that these genes exist in species A and not in species B, so I think that BLASTing these sequences against the species B genome could do that.