Hi,
I'm looking for an easy way to retrieve all the genes in a list that are associated with a certain GO term, preferably using R/Bioconductor packages. I'm not interested in under/overrepresentation or enrichment.
For instance, I want a list of all genes known to be located in 'presynaptic endosome' (GO:0098830).
I tried the method referred to in an older post (https://www.biostars.org/p/52101/); following is my code:
library(biomaRt)
# Retrieve the human Ensembl dataset via biomaRt
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# GO IDs of all terms associated with "presynaptic endosome" (including child terms)
go_1 <- c("GO:0098830", "GO:0098954", "GO:0099007", "GO:0099067", "GO:0099037", "GO:0098955", "GO:0099592", "GO:0099532")
pre.gene.data <- getBM(attributes = c('hgnc_symbol', 'ensembl_gene_id', 'go_id', 'go_linkage_type'),
                       filters = 'go', values = list(go_1), mart = ensembl)
but my code doesn't filter the genes based on the GO terms. It gives me the following output:
GO:0006886 AP3D1 ENSG00000065000 intracellular protein transport IEA
GO:0016192 AP3D1 ENSG00000065000 vesicle-mediated transport IEA
GO:0030117 AP3D1 ENSG00000065000 membrane coat IEA
... and 49 other entries for the gene "AP3D1"
I can't work out what the problem is. I replaced the filter "go_id" with "go" because that is the name now listed by listFilters(ensembl).
Please help!
I think "filters" should match the attributes and headers of the data. So, if you're using "go" as the filter, the header and attribute should also be "go" instead of "go_id"; alternatively, you can use filters="go_id" to match.
Hi, thanks for your suggestion. "go" is not a valid attribute name, and "go_id" is not a valid filter name. Using "go" as an attribute gives me this error message:
Invalid attribute(s): go
Please use the function 'listAttributes' to get valid attribute names
Have you tried changing the headers of the files (go_id to go), and then getting the list of attributes?
The link to the post in your original question doesn't work. Are you referring to this post?
annotation - biomaRt - getBM - multiple entrez ID
Another good resource:
https://www.stat.berkeley.edu/~sandrine/Teaching/PH292.S10/Durinck.pdf
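On the filtering problem in the original question, here is a possible explanation, offered as a sketch and not a confirmed diagnosis. In BioMart, a filter selects genes, and the attributes are then returned for every record of each selected gene. So filters = 'go' picks genes annotated with at least one of the requested terms, but the go_id column still lists all GO annotations of those genes. Subsetting the result on go_id afterwards keeps only the rows of interest. The function names keep_go_rows and fetch_go_genes are made up for illustration, the toy rows are placeholders and not real annotations, and the query assumes that the "go" filter and the listed attributes are valid in the current Ensembl release, as reported in this thread.

```r
# Keep only the annotation rows whose GO ID is in the set of interest.
# `annotations` is expected to have a `go_id` column, as returned by getBM().
keep_go_rows <- function(annotations, go_ids) {
  annotations[annotations$go_id %in% go_ids, , drop = FALSE]
}

# Query sketch (hypothetical helper): needs the biomaRt package and network
# access, so it is only defined here and not called.
fetch_go_genes <- function(go_ids) {
  ensembl <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  hits <- biomaRt::getBM(
    attributes = c("hgnc_symbol", "ensembl_gene_id", "go_id", "go_linkage_type"),
    filters = "go", values = go_ids, mart = ensembl
  )
  # getBM() returns every GO row of each matching gene; drop the unrelated ones.
  keep_go_rows(hits, go_ids)
}

# Offline illustration with made-up placeholder rows (not real annotations).
toy <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_A", "GENE_B"),
  go_id = c("GO:0098830", "GO:0006886", "GO:0016192"),
  stringsAsFactors = FALSE
)
toy_hits <- keep_go_rows(toy, c("GO:0098830", "GO:0098954"))
print(unique(toy_hits$hgnc_symbol))
```

With the real data, something like unique(fetch_go_genes(go_1)$hgnc_symbol) should then give the gene list, assuming the query itself succeeds.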