Hi Biostars,
I've got a list of putative enhancer elements that I've predicted based on conservation, epigenetic marks and so on, and a subset of these is predicted to be rapidly evolving. I would like to do a Gene Ontology analysis on the nearby genes. However, most GO tools take your input list of genes and reduce it to a non-redundant set. This works for many types of datasets, but not really for mine, because multiple enhancer elements can share the same closest gene. I don't think it makes much sense to treat (for example) 2 accelerated elements near Snx10 the same as 7 non-accelerated elements near Snx10. Counting a gene once for each element it is closest to ought to represent more accurately the set of genes that enhancers in the accelerated or non-accelerated group may be interacting with. Does anyone know of a tool that lets you do this?
Thanks!
That's a really interesting idea, Carlo. I hadn't thought to rank GOrilla inputs like that.
Edit: Thinking about this some more, though, even ranking doesn't quite capture how often each GO term is associated with nearby genes. In principle, each GO term ought to be counted once for every occurrence of a nearby gene in the list, so that the counts accurately reflect the dataset.
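
To make the counting concrete, here is a minimal Python sketch of what "once for every occurrence" would mean. Everything in it is an assumption for illustration: the gene-to-GO mapping, the placeholder term names (TERM_A, TERM_B), the second gene (GeneX) and the function name are all hypothetical; only Snx10 and the 2 vs. 7 element counts come from the example above. It only tallies terms and does not perform any enrichment test.

```python
from collections import Counter

# Hypothetical gene -> GO term mapping (placeholder term names, not real annotations).
GENE_TO_GO = {
    "Snx10": ["TERM_A", "TERM_B"],
    "GeneX": ["TERM_B"],
}


def go_term_counts(nearest_genes, gene_to_go, collapse_duplicates=False):
    """Tally GO terms over a list holding the nearest gene of each element.

    With collapse_duplicates=False, a gene contributes its terms once per
    element it is nearest to. With collapse_duplicates=True, the list is
    first reduced to a non-redundant set, as most GO tools do.
    """
    genes = set(nearest_genes) if collapse_duplicates else nearest_genes
    counts = Counter()
    for gene in genes:
        # Genes without annotations simply contribute nothing.
        counts.update(gene_to_go.get(gene, []))
    return counts


# One entry per enhancer element: the name of its closest gene.
accelerated = ["Snx10", "Snx10", "GeneX"]      # 2 elements near Snx10
non_accelerated = ["Snx10"] * 7 + ["GeneX"]    # 7 elements near Snx10

acc_weighted = go_term_counts(accelerated, GENE_TO_GO)
non_weighted = go_term_counts(non_accelerated, GENE_TO_GO)
acc_collapsed = go_term_counts(accelerated, GENE_TO_GO, collapse_duplicates=True)
non_collapsed = go_term_counts(non_accelerated, GENE_TO_GO, collapse_duplicates=True)

print("accelerated, per element:    ", dict(acc_weighted))
print("non-accelerated, per element:", dict(non_weighted))
print("accelerated, collapsed:      ", dict(acc_collapsed))
print("non-accelerated, collapsed:  ", dict(non_collapsed))
```

After collapsing, the two groups give identical tallies, which is exactly the information loss described above; the per-element tallies keep the 2 vs. 7 difference.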