Automated Gene Ontology enrichment for simple gene lists (not microarray data)
8.8 years ago
ruth.stoney ▴ 10

Hi,

I need to find an automated way to do GO enrichment for 3000 sets of genes. I've been working in R, but the problem I'm having is that the majority of the tools (topGO, goseq) seem to expect microarray data and do not work with simple gene lists.

DAVIDWebService seems like a perfect solution, however I can't find a function to do actual enrichment analysis. It just seems to analyse/visualise existing enrichment files.

I am comfortable writing R and Python (and could possibly get a MATLAB licence), and I would be willing to branch out if other tools are simple to use.

Thanks for any advice!

Ruth

gene • 3.2k views

goseq works with lists of genes, not with microarrays!

8.8 years ago

Have a look at the clusterProfiler package in Bioconductor. It accepts a list of Entrez Gene IDs as input, and it can calculate both a simple enrichment (over-representation) and a GSEA against Gene Ontology and other databases.

> m = enrichGO(as.character(c(1,2,3,4,5)) )
> summary(m)
                   ID                                  Description GeneRatio   BgRatio
GO:0019966 GO:0019966                        interleukin-1 binding       1/2   6/18679
GO:0019958 GO:0019958                      C-X-C chemokine binding       1/2   7/18679
GO:0019956 GO:0019956                            chemokine binding       1/2  15/18679
GO:0048306 GO:0048306            calcium-dependent protein binding       1/2  60/18679
GO:0019955 GO:0019955                             cytokine binding       1/2  83/18679
GO:0004867 GO:0004867 serine-type endopeptidase inhibitor activity       1/2  94/18679
GO:0002020 GO:0002020                             protease binding       1/2 103/18679
GO:0019838 GO:0019838                        growth factor binding       1/2 116/18679
GO:0004866 GO:0004866             endopeptidase inhibitor activity       1/2 168/18679
GO:0061135 GO:0061135             endopeptidase regulator activity       1/2 173/18679
GO:0030414 GO:0030414                 peptidase inhibitor activity       1/2 177/18679
GO:0061134 GO:0061134                 peptidase regulator activity       1/2 212/18679
                 pvalue    p.adjust      qvalue geneID Count
GO:0019966 0.0006423467 0.007493844 0.002760890      2     1
GO:0019958 0.0007493844 0.007493844 0.002760890      2     1
GO:0019956 0.0016054798 0.010703199 0.003943284      2     1
GO:0048306 0.0064141802 0.030955323 0.011404593      2     1
GO:0019955 0.0088674776 0.030955323 0.011404593      2     1
GO:0004867 0.0100397218 0.030955323 0.011404593      2     1
GO:0002020 0.0109983147 0.030955323 0.011404593      2     1
GO:0019838 0.0123821292 0.030955323 0.011404593      2     1
GO:0004866 0.0179076991 0.034295408 0.012635150      2     1
GO:0061135 0.0184381870 0.034295408 0.012635150      2     1
GO:0030414 0.0188624742 0.034295408 0.012635150      2     1
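
The pvalue column above appears to come from a one-sided hypergeometric (over-representation) test on the two ratio columns. As a rough sketch under that assumption, the first four p-values can be reproduced in base R from GeneRatio (1 of 2 input genes annotated to the term) and BgRatio (for example 6 of 18679 background genes). The variable names below are illustrative and not part of clusterProfiler.

```r
# Values read from the GeneRatio and BgRatio columns of the table above.
n_input      <- 2                 # input genes with any annotation (denominator of GeneRatio)
k_hits       <- 1                 # input genes annotated to the term (numerator of GeneRatio)
n_background <- 18679             # annotated background genes (denominator of BgRatio)
m_term       <- c(6, 7, 15, 60)   # background genes per term (numerators of BgRatio, first four rows)

# P(X >= k_hits) when drawing n_input genes without replacement from the background.
p_values <- phyper(k_hits - 1, m_term, n_background - m_term, n_input,
                   lower.tail = FALSE)
names(p_values) <- c("GO:0019966", "GO:0019958", "GO:0019956", "GO:0048306")
print(signif(p_values, 8))
```

The results should agree with the pvalue column to the printed precision. The p.adjust column then corrects these values for the number of terms tested.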

Thank you, this works perfectly and is so simple!

8.8 years ago
Kamil ★ 2.3k

You might start by considering a function in the limma package called goana. See the examples in the documentation. The function can perform an enrichment test even if you only provide a vector of Entrez Gene IDs, without any other inputs.
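
For many gene sets at once, a minimal sketch might look like the following. It assumes human genes, that the GO.db and org.Hs.eg.db annotation packages are installed alongside limma, and that the gene sets are held in a named list. The list `gene_sets` and the Entrez IDs in it are arbitrary illustrative values, not data from this thread.

```r
library(limma)  # goana() also needs the GO.db and org.Hs.eg.db packages installed

# Hypothetical input: a named list of Entrez Gene ID vectors, one per gene set.
gene_sets <- list(
  set_a = c("1", "2", "3", "4", "5"),
  set_b = c("7157", "672", "675", "5888")
)

# goana() accepts a list of ID vectors and tests every set in a single call.
go_res <- goana(gene_sets, species = "Hs")

# The result has one count column and one p-value column ("P.<set name>") per set.
top_a <- go_res[order(go_res$P.set_a), c("Term", "Ont", "N", "set_a", "P.set_a")]
head(top_a, 10)
```

The p-values returned by goana are not adjusted for multiple testing, so something like `p.adjust(go_res$P.set_a, method = "BH")` may be worth applying before interpreting the results.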

See all the other packages available for Gene Set Enrichment at Bioconductor.
