Question

GO Term - collapsing and evidence codes

9.8 years ago

oliver.tills ▴ 10

I have been successfully using both topGO and goSlim in R for exploratory analysis of GO terms in a non-model transcriptome.

goSlim produces summary statistics and graphics describing the distribution of GO terms across the GO slim of choice. However, I want the GO slim allocation applied to each transcript in my analysis, so that I can use this information in subsequent analyses. I am sure this should be a simple task, but I have tried both goProfile and the map2slim.pl GO tool and have still not achieved it. Can anyone suggest a way of doing this?

On a similar topic: I have GO annotations (produced by Trinotate) for my transcripts, but a lot of tools require GO evidence codes. How can I generate these? I have tried getEvidence() in R, but this seems appropriate only for extracting evidence codes from data that already contain them.

Help and advice is much appreciated,

Oli

annotation R RNA-Seq • 3.3k views

updated 2.6 years ago by Ram 44k • written 9.8 years ago by oliver.tills ▴ 10

Ram · Answer 1 · 2015-03-20

9.8 years ago

Dr Reema Singh ▴ 160

Hello Oliver

Evidence codes describe the kind of evidence supporting the association between a GO term and a gene product, and can be used to weight that association [http://geneontology.org/page/guide-go-evidence-codes]. I think you need to extract this information from another source, such as the GO.db Bioconductor package. Try using your GO IDs with the approach described in http://www.bioconductor.org/packages/release/bioc/vignettes/annotate/inst/doc/GOusage.pdf [I am not sure whether it will work]. You can also try a quick and dirty way: assign IEA [Inferred from Electronic Annotation] to all your transcripts and reformat your input for your program of interest. But before assigning codes, make sure to read about the basics of evidence code assignment [http://geneontology.org/page/evidence-code-decision-tree].
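
As a rough illustration of the quick and dirty route, here is a minimal Python sketch (plain Python rather than R, only to keep it free of package dependencies). The transcript IDs, the GO IDs, and the `add_evidence` helper are all hypothetical; it assumes the annotations have already been parsed into a transcript-to-GO-IDs mapping, and the three-column output is just one possible layout that would need adapting to whatever the downstream tool expects.

```python
# Hypothetical input: transcript ID -> list of GO IDs (e.g. parsed from an
# annotation report). These IDs are placeholders, not real results.
annotations = {
    "TR1|c0_g1_i1": ["GO:0005737", "GO:0006412"],
    "TR2|c0_g1_i1": ["GO:0003677"],
}


def add_evidence(annotations, code="IEA"):
    """Return (transcript, GO ID, evidence code) rows using one blanket code."""
    return [
        (transcript, go_id, code)
        for transcript, go_ids in annotations.items()
        for go_id in go_ids
    ]


rows = add_evidence(annotations)

# Print as tab-delimited text; adjust columns to the target tool's format.
for row in rows:
    print("\t".join(row))
```

Assigning the same code to every row says nothing about how well each annotation is actually supported, so it is only a formatting workaround.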

updated 2.6 years ago by Ram 44k • written 9.8 years ago by Dr Reema Singh ▴ 160

Reema,

Thanks for these suggestions. I had tried these approaches, but have not been able to extract the information I require. As you suggest, however, I might resort to artificially assigning IEA to all my data as a cheat fix.

I am still completely stuck on how to 'slice' GO data at different levels so as to reduce, or condense, the number of annotations assigned to each transcript. goSlim must be doing this process to generate its output, but how do I get at these 'per transcript' data as opposed to a summary count table?
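
To make concrete what I mean by per-transcript collapsing, here is a minimal Python sketch of the logic as I understand it, not the way goSlim or map2slim.pl is actually implemented. The toy ontology (`PARENTS`), the slim set (`SLIM`), and the transcript IDs are all made up; real relations would come from an ontology file. This version keeps every slim term that is an ancestor of (or identical to) one of a transcript's GO terms, whereas real tools may apply further rules, such as keeping only the most specific slim term.

```python
# Toy child -> parents map standing in for the GO graph (hypothetical IDs).
PARENTS = {
    "GO:C": ["GO:B"],
    "GO:B": ["GO:A"],
    "GO:D": ["GO:A"],
    "GO:A": [],
}
SLIM = {"GO:A", "GO:B"}  # hypothetical slim subset


def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable via parents."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def map_to_slim(annotations, parents, slim):
    """Map each transcript to the sorted slim terms covering its GO terms."""
    result = {}
    for transcript, terms in annotations.items():
        hits = set()
        for term in terms:
            hits |= ancestors(term, parents) & slim
        result[transcript] = sorted(hits)
    return result


annotations = {"t1": ["GO:C"], "t2": ["GO:D"], "t3": ["GO:C", "GO:D"]}
per_transcript = map_to_slim(annotations, PARENTS, SLIM)
print(per_transcript)
```

The result is one entry per transcript rather than a count per slim term, which is the kind of table I am after.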

updated 2.6 years ago by Ram 44k • written 9.8 years ago by oliver.tills ▴ 10

Hello Oliver,

Could you please describe what your output and summary count data look like? Perhaps you could post just a sample of the output?

updated 2.6 years ago by Ram 44k • written 9.8 years ago by Dr Reema Singh ▴ 160