Greetings!
I'm currently trying to find candidate genes related to a certain pathway in my species of interest. My approach began by screening certain gene families after performing differential gene expression (DGE) analysis on some datasets. Now that I have my initial candidates, I want to compare them with other genes from the same family that are involved in similar reactions or roles.
My question is about the use of an outgroup sequence in this analysis. My understanding is that an outgroup should be a sequence (or sequences) outside the group of interest and more distantly related to the ingroup. Since I'm studying a gene family and not orthologous genes, I'm not sure if I should include a sequence from the same family but from outside my taxon. Does this make sense, and is there anything I should take into consideration while analysing my results?
Thanks in advance
What if I want to compare my candidate genes (for example, from the ABC family) with sequences from other species? Should I include an outgroup, and how do I go about it?
It is difficult to answer your questions any better than I already have given the vague information you provided.
If you are talking about ABC transporters, that is a big superfamily of proteins. Even if you were to make a tree of all ABC transporters from a single species, you would have outliers in it because the superfamily is very diverse. As to whether you should include other species, I don't know what you have already, nor do I know what you are trying to achieve.
Wouldn't a gene that isn't an ABC superfamily member from some distant species suffice as an outgroup candidate here?
Most definitely not. It goes without saying that a random protein from a different species (or even the same species) would be an outlier. The goal is to select a protein that is distantly related but still belongs to the group. In other words, it should be a protein (or a group of proteins) that can be aligned to the others.
Thanks for the fast response. The ABC family is indeed not the best example for this question. The objective of my work is to find genes responsible for the transport of secondary compounds. Hence I'm building dendrograms using a) the protein sequences of my candidate genes, b) the protein sequences of genes from that family that are already characterized as transporters of secondary compounds, and c) all the protein sequences from that family in Arabidopsis.
I was on the fence about including an outgroup in this analysis, as what I'm really looking for is whether the genes from my species cluster with the genes from b).
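To make that last check concrete, here is a minimal sketch in plain Python of what "clusters with the genes from b)" could mean on a finished tree. Everything in it is an assumption for illustration: the tree is a made-up toy written as nested tuples, and the leaf names and the `cand_` / `char_` / `ath_` prefixes (standing for sets a, b and c) are hypothetical. For each candidate it finds the smallest clade that also contains a non-candidate sequence and reports which sequences those are.

```python
# Toy tree as nested tuples; all leaf names are hypothetical placeholders.
# Prefixes: "cand_" = my candidates (set a), "char_" = characterized
# transporters (set b), "ath_" = other Arabidopsis family members (set c).
tree = ((("cand_1", "char_A"), "ath_1"),
        (("cand_2", "ath_2"), ("char_B", "ath_3")))


def leaves(node):
    """Return all leaf names under a node."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


def closest_non_candidates(node, target):
    """Return the non-candidate leaves of the smallest clade that contains
    `target` plus at least one non-candidate leaf.
    Returns None if `target` is not found under `node`."""
    if isinstance(node, str):
        return [] if node == target else None
    for child in node:
        found = closest_non_candidates(child, target)
        if found is None:
            continue          # target is not in this child
        if found:
            return found      # already resolved in a smaller clade
        # target is below this node but no neighbour found yet: widen here
        return [n for n in leaves(node) if not n.startswith("cand_")]
    return None


candidates = [n for n in leaves(tree) if n.startswith("cand_")]
neighbours = {c: closest_non_candidates(tree, c) for c in candidates}

# Candidates whose closest neighbours include a characterized transporter
supported = [c for c, ns in neighbours.items()
             if any(n.startswith("char_") for n in ns)]

for c in candidates:
    print(c, "->", neighbours[c])
print("cluster with set b):", supported)
```

Note that what counts as a clade depends on where the tree is rooted, so the result of a check like this can change with the choice of root or outgroup.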