Dear GO SQL Experts,
I would like to create a GO annotation text file which contains the set of genes connected to each GOids and its all descendants (children, grandchildren, etc.).
Can somebody give me some hints how to do this? (I am working with D.melanogaster and FlyBase geneIDs - FBgn)
I already created an SQl query for this:
SELECT
term.acc AS superterm_acc,
dbxref.xref_key AS gp_acc,
evidence.code AS code
FROM term
INNER JOIN graph_path ON term.id=graph_path.term1_id)
INNER JOIN association ON (graph_path.term2_id=association.term_id)
INNER JOIN evidence ON (association.term_id=evidence.id)
INNER JOIN gene_product ON (association.gene_product_id=gene_product.id)
INNER JOIN species ON (gene_product.species_id=species.id)
INNER JOIN dbxref ON (gene_product.dbxref_id=dbxref.id)
WHERE
species.genus = 'Drosophila'
AND
species = 'melanogaster';
But I am not sure if the query is correct since the evidence codes that I get are seems to be imperfect: bash$ awk '{print $3}' downloaded_file | sort | uniq -c
6387 IDA, 921948 IEA, 2911 IEP, 113 IGC, 257 IMP, 123 ISA, 13 ISO, 664 ISS, 904 NAS, 583144 ND, 24598 RCA
accordnig to this query I got more kinds of evidence codes:
SELECT evidence.code, count(DISTINCT association.id)
FROM gene_product
INNER JOIN dbxref ON (gene_product.dbxref_id=dbxref.id)
INNER JOIN association ON gene_product.id=association.gene_product_id)
INNER JOIN species ON (gene_product.species_id=species.id)
INNER JOIN evidence ON association.id=evidence.association_id)
WHERE dbxref.xref_dbname = 'FB'
AND
species.genus = 'Drosophila'
AND
species = 'melanogaster'
GROUP BY evidence.code;
IBA 1254, IC 1130, IDA 11801, IEA 13541, IEP 974, IGC 18, IGI 2016, IKR 63, IMP 18364, IPI 1889, IRD 9, ISA 141, ISM 1301, ISO 1, ISS 14862, NAS 10277, ND 6899, RCA 19, TAS 4732
Cheers, Eszter Ari Institut für Populationsgenetik Vetmeduni Vienna Veterinärplatz 1 1210 Wien, Austria