Hi Kamila,
This is Pauline Ng, creator of SIFT (please note new website located at: http://sift-dna.org). The answer would depend on what type of analysis you are doing.
Genome-wide: If you are doing a genome-wide analysis to see which genes in a Drosophila species are affected and have evolved differently, then use a Gene Ontology (GO) enrichment tool like GOrilla to find any patterns. Also, if it's a genome paper, use Ka/Ks to see whether the genes that contain these predicted-deleterious substitutions are under relaxed selection, or use gene family size to see whether the substitutions occur in large gene families. Ka/Ks and gene family size can be obtained from Ensembl BioMart. I like scanning the list of genes manually to see if any patterns jump out at me, and then coding it up accordingly.
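As a minimal sketch of the Ka/Ks comparison described above, the following Python splits genes into those carrying a predicted-deleterious substitution and the rest, then compares the median Ka/Ks of the two groups. The function names, the `(gene_id, dN, dS)` row layout, and the `min_ds` cutoff are assumptions made for illustration, and the numbers are toy values, not real measurements. In practice the rows would come from whatever table you export from BioMart.

```python
from statistics import median


def ka_ks(dn, ds, min_ds=0.01):
    """Return dN/dS, or None when the ratio cannot be computed reliably.

    min_ds is an illustrative cutoff (an assumption): with very few
    synonymous changes the ratio is unstable, so such pairs are skipped.
    """
    if dn is None or ds is None or ds < min_ds:
        return None
    return dn / ds


def median_ka_ks(rows, flagged_genes):
    """Return (median Ka/Ks of flagged genes, median Ka/Ks of other genes).

    rows: iterable of (gene_id, dN, dS) tuples (hypothetical layout).
    flagged_genes: set of gene IDs with predicted-deleterious substitutions.
    """
    flagged, other = [], []
    for gene_id, dn, ds in rows:
        ratio = ka_ks(dn, ds)
        if ratio is None:
            continue
        (flagged if gene_id in flagged_genes else other).append(ratio)
    return (
        median(flagged) if flagged else None,
        median(other) if other else None,
    )


# Toy values for illustration only.
rows = [
    ("geneA", 0.08, 0.40),
    ("geneB", 0.30, 0.50),
    ("geneC", 0.02, 0.40),
    ("geneD", 0.01, 0.001),  # skipped: dS is below the cutoff
]
flagged_median, other_median = median_ka_ks(rows, {"geneA", "geneB"})
print(flagged_median, other_median)
```

A higher median in the flagged group would be consistent with relaxed selection on those genes, though a proper analysis would also apply a statistical test rather than compare medians by eye.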
Finding a specific gene: If you are looking for a specific gene (for example, a disease gene, and you know the disease), then you would list the gene descriptions and whether any are in OMIM (just use the checkboxes on the website), and then look through them manually to see if any are interesting hits. This may or may not get you what you want. If you know the disease name, then use tools like Endeavour, where you can enter the disease, or genes involved in the disease, along with all your genes that were predicted to affect protein function, to prioritize candidate disease genes. There is a class of tools out there that do this (Endeavour, G2D, etc.).
Hope this helps. Again, please check out our new website at http://sift-dna.org where I am adding the latest updates.
Best,
Pauline Ng
Thank you very much, Pauline, for such a detailed answer. I am working on the genome-wide analysis and will surely look at Ka/Ks and gene family size using BioMart. If I am not wrong, does this mean that by using Ka/Ks we are also trying to figure out whether a candidate gene is a pseudogene? Or will this simply tell us the selection pressure?
Hi Kamila,
In the strict sense, evolutionary biologists typically consider genes with Ka/Ks ~ 1 to be evolving neutrally, and hence pseudogenes. Enough evolutionary time has to pass in order to get Ka/Ks ~ 1: if you compare your species to another species that's too closely related, there may not be enough genome changes to accurately measure Ka/Ks. Another way to identify pseudogenes is to look for frameshifting indels or premature stop codons compared to a functional gene.
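The frameshift and stop-codon check can be sketched roughly as follows. This is a simplified illustration, not a full pseudogene caller: it assumes the candidate coding sequence has already been extracted in the reading frame of the functional gene, it uses the standard genetic code, and the function name `orf_disruptions` is hypothetical. A real comparison would align the candidate against the functional ortholog.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def orf_disruptions(cds):
    """Return a list of simple signs that a coding sequence is disrupted."""
    cds = cds.upper()
    problems = []

    # A length that is not a multiple of 3 suggests a frameshifting indel.
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")

    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    # A stop codon anywhere before the last complete codon is premature.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop {codon} at codon {index + 1}")
            break

    return problems


print(orf_disruptions("ATGGCCTAA"))     # intact toy ORF
print(orf_disruptions("ATGTAAGCCTAA"))  # toy ORF with an early stop
```

An empty list means neither sign was found in the sequence as given. It does not by itself show that the gene is functional.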
Best, Pauline