Hi all,
I have a protein sequence and its orthologs. I want to detect whether a segment of that protein (7 to 100 residues long) also occurs in its orthologs. That is to say, I want to know whether the segment is conserved.
I used psiblast like this:
psiblast -query proseg.fa -out proseg.out -db ortholog
But because some of the segments are very short, the resulting E-values are very high, so sometimes I cannot even detect a segment in its own protein.
I tried increasing "-inclusion_ethresh" and "-threshold" and setting 'dbsize', and that did return some hits, but the results are not good. I am also not sure whether the modified parameters are meaningful. Is there a formula to compute them?
Can you give me any advice on how to set the parameters, or suggest other methods for doing this?
Thank you!
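On the formula question: as far as I know, BLAST E-values follow the Karlin-Altschul relation sketched below. Here S is the raw alignment score, S' the bit score, m and n the query and database lengths, and K and \lambda are constants that depend on the scoring matrix and gap costs. This is the textbook form only; BLAST+ also applies effective-length corrections, so treat it as an approximation rather than exactly what psiblast computes.

```latex
E = K \, m \, n \, e^{-\lambda S}
\qquad \text{or equivalently} \qquad
E = m \, n \, 2^{-S'}, \quad S' = \frac{\lambda S - \ln K}{\ln 2}
```

A 7-residue segment can only reach a small maximum score S, so its E-value stays large even for a perfect match. Changing the database size changes n and therefore rescales E, but it should not change the order of the hits.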
Later, I found that adding the parameters "-evalue 20000 -matrix PAM30 -comp_based_stats 0" solves this problem. These are the same settings that psiblast on the NCBI web site uses.
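For anyone who wants to copy it, the full command might look like the sketch below. It simply combines the original command with the three parameters above. The file and database names (proseg.fa, proseg.out, ortholog) are the ones from my first post. The variable names and the check for whether psiblast and the query file exist are only there for illustration.

```shell
# Sketch: the original psiblast call plus the short-query settings.
query=proseg.fa      # segment(s) in FASTA format
db=ortholog          # protein BLAST database of the orthologs
out=proseg.out       # output file

# Permissive E-value cutoff, PAM30 matrix, no composition-based statistics.
short_opts="-evalue 20000 -matrix PAM30 -comp_based_stats 0"

cmd="psiblast -query $query -db $db -out $out $short_opts"

# Run only if psiblast is installed and the query file is present;
# otherwise just print the command that would be executed.
if command -v psiblast >/dev/null 2>&1 && [ -f "$query" ]; then
  $cmd
else
  echo "psiblast or $query not found; would run: $cmd"
fi
```

With an E-value cutoff this high, most reported hits will be noise, so the hits still need to be checked by hand (for example, whether the segment aligns over its full length).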
I'm a bit surprised by the PAM30 matrix. Can you post the link where you found the values?
Just open http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&BLAST_PROGRAMS=blastp&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on&LINK_LOC=blasthome, choose "psiblast", and enter a short sequence, preferably shorter than 20 amino acids. Then run BLAST. On the result page, click "Edit and resubmit" and you will see the values.