Question

Which factors help to select positive blastx results?


9.9 years ago

Kurban ▴ 230

Hello guys!

I ran BLASTX with a query file (containing more than 140,000 nucleotide sequences) against a db file (containing more than 1,400 polypeptide sequences). About 20,000 query sequences had hits (with e-value 3); when I changed the e-value to 5, I still got more than 15,000 aligned sequences. When I checked their identity (%), more than 2,000 of them had an identity below 30. Which factors should I consider when I analyze the blastx results?

Should I select 3 or 5 for e-value?

What % identity could be used as the threshold for blastx results?

Thank you in advance.

blast • 1.7k views

updated 2.7 years ago by Ram 44k • written 9.9 years ago by Kurban ▴ 230

score 0 · Answer 1 · 2014-12-22

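Lower e-values are always better. The E-value is the expected number of hits with a score at least this good that you would see just by chance in a database of this size (it is not strictly a probability, although the two are nearly equal for small values), so an e-value of 3 should be better than 5 (though I normally use e-values that are nearer 1e-3 than 3).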
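To make the scale of these numbers concrete, here is a small sketch that converts an E-value (an expected count of chance hits) into the probability of seeing at least one such chance hit, using the standard relation P = 1 - exp(-E). The function name `evalue_to_pvalue` is just something I made up for illustration.

```python
import math

def evalue_to_pvalue(evalue):
    """Probability of at least one chance hit, given the expected count E.

    Uses P = 1 - exp(-E); expm1 keeps precision for very small E.
    """
    return -math.expm1(-evalue)

# Compare a loose cutoff with a stricter one (values taken from the thread).
for e in (3, 1e-3):
    print(f"E = {e:g}  ->  P(at least one chance hit) = {evalue_to_pvalue(e):.6f}")
```

With a cutoff as loose as 3 a chance hit is almost guaranteed, while near 1e-3 the E-value and the probability are practically the same number.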

I'd also suggest a minimum identity cutoff of ~75%, but this depends on the query and the subject more than on the program.
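If it helps, here is a minimal sketch of how the two cutoffs could be applied together, assuming the search was written in BLAST+ tabular format (`-outfmt 6`) with the default twelve columns. The function name `filter_hits`, the default thresholds, and the three sample rows are all made up for illustration; adjust the cutoffs to your own data.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def filter_hits(handle, max_evalue=1e-3, min_pident=30.0, min_length=0):
    """Yield hits (as dicts) that pass the E-value, identity and length cutoffs."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["length"] = int(hit["length"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if (hit["evalue"] <= max_evalue
                and hit["pident"] >= min_pident
                and hit["length"] >= min_length):
            yield hit

# Hypothetical rows standing in for a real blastx result file.
SAMPLE = (
    "q1\tp1\t82.5\t120\t21\t0\t1\t360\t5\t124\t2e-60\t190\n"
    "q2\tp7\t27.3\t44\t30\t1\t10\t141\t3\t46\t3.1\t25.4\n"
    "q3\tp2\t45.0\t200\t105\t3\t4\t603\t1\t200\t1e-40\t150\n"
)

kept = [h["qseqid"] for h in filter_hits(io.StringIO(SAMPLE),
                                         max_evalue=1e-3, min_pident=75.0)]
print(kept)
```

For a real file you would pass `open("results.tsv")` (a placeholder name) instead of the `io.StringIO` sample. Alignment length is included as an optional third cutoff because a high identity over a very short alignment is weak evidence on its own.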