Hi, I need to create a table with how e-values are distributed for some sequences, as a way of reporting how conserved the sequences are.
I got some inconsistent results, and boiled it down to if I query a sequence alone, or if I query it together with other sequences. The result page has a drop-down menu where you can only pick a single query sequence. So I assume that it is independent of the other query sequences?
Here is an example to show it. In the first test, I query ">1" alone, and the top hits are 4e-9. In the second test, I query ">1" together with ">0" and ">2", and when I look at only ">2", the top hits are "3e-9"
First test set-up:
First test results:
Second test set-up: Second test results:
I just did the same test with other sequences, and I got either 0.35 or 0.1 as the top hits. All settings are identical between the two searches. I just go to nucleotide BLAST, enter my queries, enter an organism, change to "blastn", and change the number of hits to 20.000. All other settings are the defaults.
So what is the correct way of doing this search? I'm so confused at the moment :<
Thanks for coming back and posting the solution.
I wonder how many blast searches similar to yours were affected over time. At least in your case, blast search result was not wrong.
e-values
anyway change over time as the underlying database probably changes each day.