I got over 900 sequences in my result with an E-value cutoff of 1e-5. However, other people's work reported only 300-400 significant sequences. Could someone please help me solve this problem?
Your data does not need to give results identical to those of others. Results are a characteristic of the data going into the analysis. Only if your data were identical to what others used (which I assume is not the case) would the difference be a problem.
Hi,
Thank you for your response! I tested using the same proteome that other people used in their paper (I retrieved it from NCBI), and the result differs a lot.
Most likely the reason for this problem is the same as in your other post: you are using -outfmt 6 instead of the pairwise alignment format. Since BLAST is a local aligner, it will often find multiple high-scoring segment pairs (HSPs) between two proteins, rather than a single global alignment. If there are 3 HSPs between a query and its match, that counts as a single hit and will be shown as a single line in the pairwise alignment output (though it will be shown as 3 alignments in the alignment part of the output). Since -outfmt 6 doesn't show alignments, that single hit will actually be shown as 3 lines, one per HSP. Even though you are asking only for a top hit with -max_target_seqs 1, it will often show multiple lines because of HSPs. As I suggested to you before, try removing -outfmt 6 from your command line just to see what that output looks like.
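If you want to check whether HSPs explain the gap, a minimal sketch like the one below counts rows versus unique query-subject pairs in the tabular output. It assumes the default 12-column -outfmt 6 layout, where column 1 is qseqid and column 2 is sseqid. The function name `summarize_outfmt6` and the example rows are made up for illustration and are not from your data.

```python
def summarize_outfmt6(lines):
    """Count HSP rows, unique query-subject hits, and unique queries.

    Assumes tab-separated rows in the default -outfmt 6 column order,
    with qseqid in the first column and sseqid in the second.
    """
    hsp_rows = 0
    hits = set()      # unique (query, subject) pairs
    queries = set()   # unique query IDs with at least one row
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        hsp_rows += 1
        hits.add((qseqid, sseqid))
        queries.add(qseqid)
    return {
        "hsp_rows": hsp_rows,
        "unique_hits": len(hits),
        "unique_queries": len(queries),
    }


# Made-up rows: q1 matches s1 through 3 HSPs, q2 matches s5 through 1 HSP.
example = [
    "q1\ts1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-50\t200",
    "q1\ts1\t85.0\t50\t7\t0\t150\t200\t150\t200\t1e-20\t90",
    "q1\ts1\t80.0\t40\t8\t0\t250\t290\t250\t290\t1e-10\t60",
    "q2\ts5\t95.0\t120\t6\t0\t1\t120\t1\t120\t1e-60\t240",
]
print(summarize_outfmt6(example))
# → {'hsp_rows': 4, 'unique_hits': 2, 'unique_queries': 2}
```

To run it on a real file, pass an open file handle instead of the list, e.g. `summarize_outfmt6(open("results.tsv"))`, where `results.tsv` stands in for your own output file. If `unique_queries` lands near 300-400 while `hsp_rows` is over 900, the extra lines are most likely additional HSPs rather than additional significant sequences.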