Hi all, I have clustered around 30,000 genes from roughly 90 genomes into orthologous groups using OrthoMCL. I want to investigate the evolutionary forces acting on particular orthologous clusters. To do that, I selected the longest sequence from each orthologous cluster (to increase phylogenetic spread) as the representative of that group and queried it against the BLAST nr database to find homologs. My question: is choosing one representative sequence (the longest) from each group a valid way to find potential homologs? Or is there another way to choose a representative member of an orthologous cluster?
Thanks in advance
regards
What exactly are you trying to do, and why do you need to do that? If the sequences are all in the same orthologous group, they should, in theory at least, all be as representative as one another. Otherwise they aren't really homologs, right?!
You could use CD-HIT to cluster each ortholog group and let it pick a representative sequence for you, but I'm still not 100% sure what the objective is here.
Thanks for the answer. My goal is to build a phylogenetic tree for each ortholog group (OG) by finding its homologs in the nr database. For example, if I have 50 genes in one OG, then BLASTing each gene individually against nr to find homologs would be tedious and time-consuming. So I want to select one representative member of each OG and BLAST it against nr.
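For anyone wanting to script that selection step, here is a minimal sketch of picking the longest member of each OG. It assumes the groups are listed one per line as `group_id: member1 member2 ...` and that the member IDs match the FASTA headers; the function names and the toy data are made up for illustration, so adjust the parsing to your actual files.

```python
def read_fasta(lines):
    """Parse FASTA lines into {sequence_id: sequence}."""
    seqs = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            current = line[1:].split()[0]
            seqs[current] = []
        elif current is not None:
            seqs[current].append(line)
    return {sid: "".join(parts) for sid, parts in seqs.items()}


def read_groups(lines):
    """Parse 'group_id: member1 member2 ...' lines into {group_id: [members]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, _, members = line.partition(":")
        groups[group_id.strip()] = members.split()
    return groups


def longest_representatives(groups, seqs):
    """Return {group_id: ID of the longest member present in seqs}."""
    reps = {}
    for group_id, members in groups.items():
        present = [m for m in members if m in seqs]
        if present:
            # Ties on length are broken by sequence ID so the result is reproducible.
            reps[group_id] = max(present, key=lambda m: (len(seqs[m]), m))
    return reps


# Toy data (hypothetical IDs and sequences).
fasta_lines = """>sp1|g1
MKT
>sp2|g2
MKTAYIAK
>sp1|g3
MA
>sp3|g4
MAL
""".splitlines()
group_lines = ["OG1: sp1|g1 sp2|g2", "OG2: sp1|g3 sp3|g4"]

reps = longest_representatives(read_groups(group_lines), read_fasta(fasta_lines))
print(reps)
```

The representatives can then be written to a single FASTA file and BLASTed against nr in one run instead of one search per gene.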
Well, strictly speaking, it shouldn't really matter which ortholog from a given group you pick. If they are all sufficiently similar to one another (which depends on your identity cutoff for inclusion as an ortholog), you'd expect BLASTing any ortholog in that group to return the same hits.
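That expectation is easy to spot-check on a few groups: BLAST two or three members of the same OG and compare the hit sets. The sketch below assumes BLAST tabular output (`-outfmt 6`, with the query ID in column 1, subject ID in column 2 and e-value in column 11); the `1e-5` cutoff, the function names and the toy rows are arbitrary choices for illustration.

```python
def hits_per_query(lines, max_evalue=1e-5):
    """Collect {query_id: set of subject IDs} from BLAST tabular (outfmt 6) lines."""
    hits = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.setdefault(query, set()).add(subject)
    return hits


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 means identical hit sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy rows (hypothetical IDs and scores); only columns 1, 2 and 11 are used.
rows = [
    ("sp1|g1", "refA", "90.0", "100", "10", "0", "1", "100", "1", "100", "1e-50", "200"),
    ("sp1|g1", "refB", "80.0", "100", "20", "0", "1", "100", "1", "100", "1e-20", "120"),
    ("sp2|g2", "refA", "88.0", "100", "12", "0", "1", "100", "1", "100", "1e-45", "190"),
    ("sp2|g2", "refC", "75.0", "100", "25", "0", "1", "100", "1", "100", "1e-30", "140"),
    ("sp2|g2", "refD", "40.0", "50", "30", "0", "1", "50", "1", "50", "0.5", "30"),
]
blast_lines = ["\t".join(row) for row in rows]

hits = hits_per_query(blast_lines)
overlap = jaccard(hits["sp1|g1"], hits["sp2|g2"])
print(overlap)
```

If the overlap between members of the same OG is consistently close to 1, the choice of representative matters little; if it is low, a single representative is probably missing homologs.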
I have no idea whether you'd see much difference between picking the longest and the shortest sequence in an ortholog group to BLAST, though. I guess a longer one might hit more sequences in total. I do like Jean-Karim's suggestion of building HMMs from the clusters, so that you capture the information from all of their members.
Thanks for answering and clearing up my doubts.