Hello,
I am analyzing RNA-seq data for the first time. The dataset consists of two bacterial strains of the same species, each cultured in 5 different conditions. Briefly, I trimmed the FASTQ reads and mapped them against a reference genome (HISAT2). Then the transcripts were assembled and quantified (StringTie). Raw gene counts were extracted using a Python script provided by the authors of these packages. Differential gene expression was assessed with DESeq2.
Now I would like to perform a GO enrichment analysis within the R environment. I have seen various packages, such as goseq, topGO and GOstats. However, they seem to require an annotation database containing the GO terms of each gene for a given organism (e.g. org.Mm.eg.db for Mus musculus).
I downloaded the GO annotations for my organism from the QuickGO online resource; the file links each gene name/locus to its GO terms. It is tab-delimited plain text, but there is also the option of retrieving this information in .gaf and .gpad formats (I do not know what these are used for).
My question is: how can I create a .db object like org.Mm.eg.db from the GO annotations downloaded from the QuickGO database, so that I can perform a GO enrichment analysis?
Thank you very much,
Thank you! Actually, I was digging more into topGO. The link to the post doesn't work. It would be great if you could fix it so that I can run some tests as a proof of concept!
Neuls: I fixed the link in e.rempel's answer. It should work now.
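For the topGO proof of concept mentioned above, one possible route that avoids building a full .db package is to convert the QuickGO .gaf download into a plain gene-to-GO mapping file (one line per gene: the gene ID, a tab, then a comma-separated list of GO IDs). The sketch below is only an illustration under some assumptions: it relies on the standard GAF column layout (gene symbol in column 3, qualifier in column 4, GO ID in column 5), it assumes the symbol column matches the gene IDs in the count table, and the helper names `gaf_to_gene2go` and `format_mapping` as well as the sample rows are made up for this example.

```python
import io
from collections import defaultdict


def gaf_to_gene2go(lines, id_column=2):
    """Collect the set of GO IDs per gene from an iterable of GAF lines.

    id_column is a 0-based index; 2 is the DB Object Symbol column.
    Assumption: that column matches the gene IDs used in the count table.
    """
    gene2go = defaultdict(set)
    for line in lines:
        # Skip blank lines and header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # malformed row
        # Skip negative annotations ("NOT" or "NOT|involved_in").
        if "NOT" in fields[3].split("|"):
            continue
        gene2go[fields[id_column]].add(fields[4])
    return gene2go


def format_mapping(gene2go):
    """Render one 'gene<TAB>GO1, GO2' line per gene, sorted for stable output."""
    rows = [gene + "\t" + ", ".join(sorted(gos))
            for gene, gos in sorted(gene2go.items())]
    return "\n".join(rows) + "\n"


def _row(symbol, qualifier, go_id):
    # Hypothetical 17-column GAF row; all other values are placeholders.
    return "\t".join(["UniProtKB", "X_" + symbol, symbol, qualifier, go_id,
                      "PMID:0", "IEA", "", "P", "", "", "protein",
                      "taxon:0", "20200101", "UniProt", "", ""])


# Made-up sample data standing in for a real QuickGO download.
SAMPLE_GAF = "\n".join([
    "!gaf-version: 2.2",
    _row("geneA", "involved_in", "GO:0006412"),
    _row("geneA", "involved_in", "GO:0005840"),
    _row("geneB", "enables", "GO:0003677"),
    _row("geneC", "NOT|involved_in", "GO:0006412"),
]) + "\n"

mapping = gaf_to_gene2go(io.StringIO(SAMPLE_GAF))
print(format_mapping(mapping), end="")
```

With a real download, the same functions can be applied to the opened .gaf file and the output of `format_mapping` written to disk. In R, topGO's `readMappings()` reads a file in this layout into a gene-to-GO list, which can then be passed as the `gene2GO` argument together with `annot = annFUN.gene2GO` when creating the `topGOdata` object.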