I have been successfully using both topGO and goSlim in R for exploratory analysis of GO terms in a non-model transcriptome.
goSlim produces summary statistics and graphics describing the distribution of GO terms across the goSlim of choice. However, I want the goSlim allocation applied to each transcript in my analysis so that I can use this information in subsequent analyses. I am sure that this should be a simple task, but I have experimented with both goProfile and the map2slim.pl GO tool and have still not achieved it. Can anyone suggest a way of doing this?
On a similar topic: I have GO annotations (produced from Trinotate) for my transcripts, but a lot of tools require the GO EvidenceCodes. How can I generate these? I have tried getEvidence() in R, but this seems appropriate only for extracting EvidenceCodes from data that already contain them.
Any help or advice would be much appreciated,
Oli
Reema,
Thanks for these suggestions. I had tried these approaches, but have not been able to extract the information I require. As you suggest, however, I might resort to artificially assigning IEA to all my data as a cheat fix.
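A minimal sketch of what that cheat fix could look like, written in Python for illustration only. The transcript names, the three-column tab-separated layout, and the helper `with_placeholder_evidence` are hypothetical; the GO IDs are just the three ontology roots used as example values. IEA here is a blanket placeholder, not a curated evidence code, and downstream tools may expect a different file layout.

```python
import csv
import io

# Example transcript -> GO term annotations (illustrative values only).
ANNOTATIONS = {
    "transcript_1": ["GO:0008150", "GO:0003674"],
    "transcript_2": ["GO:0005575"],
}


def with_placeholder_evidence(annotations, code="IEA"):
    """Return (transcript, GO term, evidence code) rows with one blanket code."""
    rows = []
    for transcript, terms in annotations.items():
        for term in terms:
            rows.append((transcript, term, code))
    return rows


rows = with_placeholder_evidence(ANNOTATIONS)

# Write the rows as a tab-separated table (assumed layout, not a GAF file).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
table = buffer.getvalue()
print(table, end="")
```

Keeping the evidence code as a parameter makes it easy to swap the placeholder later if real evidence codes become available.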
I am still completely stuck on how to 'slice' GO data at different levels so as to reduce, or condense, the number of annotations assigned to each transcript. goSlim must be doing this process to generate its output, but how do I get at these 'per transcript' data as opposed to a summary count table?
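To make the 'per transcript' idea concrete, here is a rough sketch in Python. It is not how goSlim or map2slim.pl are implemented. It assumes the ontology is available as a child-to-parents map and uses a toy graph with made-up term IDs. Each transcript's terms are walked up to their ancestors, and any ancestor that belongs to the slim is kept. This keeps every matching slim ancestor rather than only the most specific one.

```python
from collections import deque

# Toy ontology: child term -> direct parents (hypothetical IDs, not real GO terms).
PARENTS = {
    "GO:leaf1": ["GO:mid1"],
    "GO:leaf2": ["GO:mid1", "GO:mid2"],
    "GO:leaf3": ["GO:mid2"],
    "GO:mid1": ["GO:root"],
    "GO:mid2": ["GO:root"],
    "GO:root": [],
}

# Hypothetical slim: the subset of broad terms to condense onto.
SLIM = {"GO:mid1", "GO:mid2"}

# Hypothetical per-transcript annotations (e.g. parsed from an annotation table).
TRANSCRIPT_GO = {
    "transcript_1": ["GO:leaf1"],
    "transcript_2": ["GO:leaf2", "GO:leaf3"],
    "transcript_3": ["GO:root"],
}


def ancestors(term, parents):
    """Return the term itself plus all of its ancestors (breadth-first walk)."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def slim_per_transcript(transcript_go, parents, slim):
    """Map each transcript to the sorted slim terms reachable from its annotations."""
    result = {}
    for transcript, terms in transcript_go.items():
        hits = set()
        for term in terms:
            hits |= ancestors(term, parents) & slim
        result[transcript] = sorted(hits)
    return result


per_transcript = slim_per_transcript(TRANSCRIPT_GO, PARENTS, SLIM)
for transcript, slim_terms in per_transcript.items():
    print(transcript, ",".join(slim_terms) or "-")
```

A transcript annotated only with a term above the slim (like `transcript_3` here) ends up with no slim term, so such cases need a decision of their own.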
Hello Oliver,
Could you please describe your output and the summary count data? Perhaps you could post just a sample of the output?