How to get all the genes (total count) annotated to a particular GO term?
9.6 years ago

I have some GO IDs. I want to see how many genes in total are associated with a particular GO term. Could someone please help me with this?

gene • 4.1k views
9.6 years ago
Uma A ▴ 230

You might want to take a look at Ensembl BioMart. GO-related data is in the "EXTERNAL" section. Select all GO options if you need all the details of the associated GO term. The data is returned gene-wise, i.e. a gene appears on multiple lines, one for each GO term associated with it. Do not forget to select the "Associated gene name" option from the "GENE" section to get the gene symbol; otherwise you will only get the Ensembl transcript and gene IDs along with the GO terms.

I would ideally have recommended the DAVID Knowledgebase, but its link does not seem to be available at the moment. Do check later whether it has come back. It provides all the gene-GO data in every possible combination, sorted by Biological Process, Molecular Function and Cellular Component in separate files.

After obtaining the files, you can count the total number of genes associated with the desired GO terms using a simple script.
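
As a rough illustration of such a script, here is a minimal sketch. It assumes a tab-separated export with a single header row, and that you know which columns hold the gene name and the GO accession. The function name `count_go_genes`, the file name and the column positions are placeholders of mine, not anything BioMart guarantees.

```shell
# count_go_genes FILE GO_ID GENE_COL GO_COL
# Prints the number of distinct, non-empty gene names found on rows
# whose GO column equals GO_ID. The first line is skipped as a header.
count_go_genes() {
    awk -F'\t' -v go="$2" -v g="$3" -v t="$4" '
        NR > 1 && $t == go && $g != "" { seen[$g] = 1 }  # remember each gene name once
        END { n = 0; for (k in seen) n++; print n }      # count the distinct names
    ' "$1"
}
```

For example, if the gene name is in column 2 and the GO accession in column 3 of a hypothetical `mart_export.txt`, you would call `count_go_genes mart_export.txt GO:0006520 2 3`. Rows without a gene name are ignored, so the count is of unique gene symbols rather than of lines.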

Hi

Could you please give me an example of how I can get all the genes annotated to GO:0006520?

$ curl -s  "http://www.ebi.ac.uk/QuickGO/GAnnotation?tax=9606&relType=IP&goid=GO:0006520&format=tsv" 

>>> 2
$1    DB    UniProtKB
$2    ID    A0A024R050
$3    Splice    -
$4    Symbol    SLC1A3
$5    Taxon    9606
$6    Qualifier    -
$7    GO ID    GO:0006536
$8    GO Name    glutamate metabolic process
$9    Reference    GO_REF:0000019
$10    Evidence    IEA
$11    With    Ensembl:ENSMUSP00000005493
$12    Aspect    Process
$13    Date    20150404
$14    Source    Ensembl
<<< 2

>>> 3
$1    DB    UniProtKB
$2    ID    A0A024R050
$3    Splice    -
$4    Symbol    SLC1A3
$5    Taxon    9606
$6    Qualifier    -
$7    GO ID    GO:0006537
$8    GO Name    glutamate biosynthetic process
$9    Reference    GO_REF:0000019
$10    Evidence    IEA
$11    With    Ensembl:ENSMUSP00000005493

...

Hi

How does this information tell me the total number of genes annotated to this term?

This is basic Linux: keep the symbol and taxon columns, then count the unique lines.

~$ curl -s  "http://www.ebi.ac.uk/QuickGO/GAnnotation?tax=9606&relType=IP&goid=GO:0006520&format=tsv" | cut -f 4,5  | sort | uniq | wc -l
362
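
One caveat, as far as I can tell: if the TSV starts with a header row (the column names in the listing above suggest it does), the pipeline above counts that header as one extra line. A minimal sketch that skips the first line and counts distinct values of column 4 (Symbol in the listing above) could look like this. The function name `count_unique_symbols` is a placeholder of mine, and the single-header-row layout is an assumption.

```shell
# Reads an annotation TSV on stdin and prints the number of distinct
# values in column 4, skipping the first line as a presumed header.
count_unique_symbols() {
    tail -n +2 | cut -f 4 | sort -u | wc -l | tr -d ' '  # tr strips the padding some wc versions add
}
```

You would pipe the `curl` output (or a saved copy of it) into `count_unique_symbols` in place of the `cut | sort | uniq | wc -l` part.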
9.6 years ago

Hi

I tried BioMart with the Ensembl Genes 72 database and the yeast dataset (I am working on S. cerevisiae). I selected the options you described and found 31 genes associated with the GO term I was looking for, but when I cross-checked with AmiGO (GOC) I found 255 genes annotated to the term. I do not know why there is a difference.

I tried to replicate your issue, but I get only 24 entries for GO:0006520 in the gene_association.sgd file for Saccharomyces cerevisiae downloaded from AmiGO (GOC), whereas Ensembl Genes 79 gives 31 entries for the same GO term. Of these 31 entries, 3 have no associated gene name, so there are only 28 unique gene names. Also, the 24 genes obtained from AmiGO are a complete subset of these 28 genes from Ensembl.
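
A subset check like this can be sketched with `sort` and `comm`, assuming each source has been reduced to a plain text file with one gene name per line. The function name `genes_missing_from` and the file names in the usage note are placeholders of mine.

```shell
# genes_missing_from A B
# Prints the gene names that occur in file A but not in file B.
# No output means every gene in A is also in B (A is a subset of B).
genes_missing_from() {
    a=$(mktemp)
    b=$(mktemp)
    sort -u "$1" > "$a"   # comm needs sorted, de-duplicated input
    sort -u "$2" > "$b"
    comm -23 "$a" "$b"    # keep only lines unique to the first file
    rm -f "$a" "$b"
}
```

For example, `genes_missing_from amigo_genes.txt ensembl_genes.txt` printing nothing would mean the first list is fully contained in the second.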
