Question

Homologous Sequence

0

11.9 years ago

bharat.85.monu ▴ 20

Hi,

I have a set of proteins and I need to find homologous partners for them. I wanted to automate the search, so I wrote a Perl script for it. Now the question is: what % identity (both min and max) should I use when searching for homologous sequences? Also, the database I am using for blastp is PDB, since I only need the structures. Kindly help with this problem.

Regards, BHARAT

sequence • 5.5k views

ADD COMMENT • link updated 11.9 years ago by Neilfws 49k • written 11.9 years ago by bharat.85.monu ▴ 20

2

Please take care with the word "homologous". It has a very specific meaning which you should understand.

ADD REPLY • link 11.9 years ago by Neilfws 49k

1

The percent identity (PID) cutoff you choose depends on the purpose. Please give us more details about your problem so that we can answer your question.

ADD REPLY • link 11.9 years ago by aravind121 ▴ 70

0

My goal is to obtain homologous protein structures for my dataset. Since the dataset is large, I need some sort of cutoff values to identify the homologous sequences.

ADD REPLY • link 11.9 years ago by bharat.85.monu ▴ 20

0

You can use Dali to obtain identity scores from structural alignments.

ADD REPLY • link 11.9 years ago by Pappu ★ 2.1k

score 0 · Answer 1 · 2012-12-27

0

11.9 years ago

qiyunzhu ▴ 430

I recall that 30% is an empirical cutoff in terms of protein sequence similarity.

If you use BLAST, the E-value is a better indicator of homology than percent identity, because it puts the alignment score in the context of the query length and the size of the database searched. For example, a short protein is more likely to resemble an unrelated sequence simply by chance; in that case, a high (non-significant) E-value should carry more weight than a high identity. As far as I know, there is no standard E-value cutoff. You can try anything from 1e-2 (IMG's protocol) to 1e-10 or even lower.
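As a rough sketch of this kind of filtering, here is how hits could be screened on both E-value and identity in Python. It assumes blastp was run with tabular output (`-outfmt 6`), whose default columns are listed in the code. The function name `filter_hits`, the cutoff values, and the example hits are made up for illustration and are not recommended settings.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, max_evalue=1e-5, min_identity=30.0):
    """Yield hits that pass both cutoffs (the default values are illustrative only)."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        if (float(hit["evalue"]) <= max_evalue
                and float(hit["pident"]) >= min_identity):
            yield hit


# Hypothetical hits, invented for this example:
#  1) long alignment, moderate identity, very low E-value
#  2) very short alignment, high identity, but a high E-value (chance match)
#  3) long alignment, low E-value, but identity below the cutoff
example = (
    "query1\t1ABC_A\t45.20\t210\t110\t3\t1\t208\t5\t214\t2e-50\t180\n"
    "query1\t2XYZ_B\t80.00\t15\t3\t0\t40\t54\t10\t24\t4.1\t22.3\n"
    "query2\t3DEF_A\t24.50\t190\t140\t4\t3\t190\t1\t189\t1e-08\t55.1\n"
)

kept = list(filter_hits(io.StringIO(example)))
for hit in kept:
    print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

The second hit shows the point about short sequences: 80% identity over 15 residues is rejected because its E-value is high. For a real run, pass an open file handle for your BLAST output instead of the `io.StringIO` example.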

On the other hand, there are a bunch of programs to help you identify orthologs, using more sophisticated algorithms, such as OrthoMCL. You can try those...

ADD COMMENT • link 11.9 years ago by qiyunzhu ▴ 430

0

E-values give a measure of similarity, not homology...

ADD REPLY • link 11.9 years ago by Whetting ★ 1.6k

0

I know. But is there a better and practical way to measure "homology"?

ADD REPLY • link 11.9 years ago by qiyunzhu ▴ 430

0

I know what you mean. Homology is hard to show in a "high-throughput" manner. Just wanted to clarify for everyone reading the post!