Hello everyone,
I want to ask whether there is a reliable method to identify gene pairs (paralogs) that have a direct parent/daughter relationship. For example, if duplication of gene A produces gene B, and duplication of gene B produces gene C, then with gene A as input I only want to detect gene B as its paralog. In other words, I don't want to detect the "gene A : gene C" pair as paralogs.
I have seen that stringent blastp criteria (based on e-value, alignment length, and percent sequence identity) have been defined for paralog identification. Would using those criteria help for this purpose?
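To make the question concrete, here is a minimal sketch of the kind of blastp filter I have in mind. It assumes an all-vs-all blastp of the proteome against itself, saved in the standard 12-column tabular format (-outfmt 6). The function name and the three cutoff values are placeholders of my own, not published criteria. As far as I can tell, a filter like this judges each pair on its own, so the "gene A : gene C" pair would still be reported if it passes the cutoffs.

```python
import csv

# Column names of the default BLAST tabular output (-outfmt 6).
COLS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
        "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def filter_hits(lines, max_evalue=1e-10, min_pident=30.0, min_length=150):
    """Keep non-self hits that pass all three cutoffs.

    The default cutoffs are hypothetical placeholders; replace them
    with the criteria from whichever study is being followed.
    """
    kept = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLS, row))
        query, subject = rec["qseqid"], rec["sseqid"]
        if query == subject:
            continue  # a gene is not its own paralog
        pident = float(rec["pident"])
        length = int(rec["length"])
        evalue = float(rec["evalue"])
        if evalue <= max_evalue and pident >= min_pident and length >= min_length:
            kept.append((query, subject, pident, length, evalue))
    return kept
```

Any iterable of lines works as input, for example an open file handle on the blastp output.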
Some studies have also used CD-HIT for paralog identification, but my understanding is that CD-HIT would cluster genes A, B and C together as paralogs.
The reciprocal best hit (RBH) approach is usually used for orthologs, but can it also be used for paralog identification? I was wondering whether RBH could help me here, since the reciprocal requirement can help identify highly similar sequences.
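Here is a rough sketch of what I mean by RBH within a single genome. It assumes the hits come from an all-vs-all self-blastp and have already been reduced to (query, subject, bitscore) tuples. The function names and the toy scores are made up for illustration.

```python
def best_hits(hits):
    """Map each query to its highest-scoring non-self subject."""
    best = {}
    for query, subject, bitscore in hits:
        if query == subject:
            continue  # ignore self-hits from the all-vs-all search
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_pairs(hits):
    """Return sorted pairs where each gene is the other's best hit."""
    best = best_hits(hits)
    pairs = set()
    for query, subject in best.items():
        if best.get(subject) == query:
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)

# Toy scores (invented): B and C are the most similar pair.
toy_hits = [
    ("A", "A", 500), ("A", "B", 300), ("A", "C", 250),
    ("B", "B", 500), ("B", "C", 400), ("B", "A", 300),
    ("C", "C", 500), ("C", "B", 400), ("C", "A", 250),
]
print(reciprocal_best_pairs(toy_hits))  # prints [('B', 'C')]
```

With these toy scores only the B : C pair is reciprocal, so gene A ends up without a partner even though B is its best hit. That is part of why I am unsure whether RBH alone is enough to recover the "gene A : gene B" relationship.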
I shall be very thankful for any help.