Hi,
I would like to use https://github.com/tanghaibao/goatools . For that I need an association file containing the gene-to-GO-term mapping: a two-column tabular file with the gene in the first column and the GO terms in the second (separated by ; if there are multiple terms).
Where can I download such a file for:
* Arabidopsis
* Oryza sativa
Thank you for the links. I would also be interested in a generic approach, because I also work on completely new organisms. So I downloaded the following files
If you are working on a completely new organism, then there won't be any existing GO terms mapped to your genes, since no one knows yet what your genes are. You can generate de novo GO terms for your sequences by scanning them against various HMM databases, and then mapping GO terms to the HMM database IDs (http://www.geneontology.org/GO.indices.shtml), or you can use blast2go (http://www.blast2go.com/b2ghome).
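To make the HMM route a bit more concrete, here is a minimal Python sketch of the second step, transferring GO terms from HMM hits to genes. It assumes the mapping file uses the layout `Pfam:PF00069 Pkinase > GO:protein kinase activity ; GO:0004672` with `!` comment lines, which is how I understand those files to look; check it against the file you actually download. The function names, the gene names, and the `gene_hits` dictionary (standing in for whatever your HMM scan produced) are made up for illustration.

```python
from collections import defaultdict


def parse_external2go(lines):
    """Parse mapping lines like 'Pfam:PF00069 Pkinase > GO:name ; GO:0004672'.

    Returns {external_id: set of GO IDs}. The line layout is an assumption.
    """
    mapping = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):  # skip blanks and comment lines
            continue
        left, sep, right = line.partition(" > ")
        if not sep:
            continue
        # 'Pfam:PF00069 Pkinase' -> 'PF00069'
        ext_id = left.split()[0].split(":", 1)[-1]
        # 'GO:name ; GO:0004672' -> 'GO:0004672'
        go_id = right.rsplit(";", 1)[-1].strip()
        if go_id.startswith("GO:"):
            mapping[ext_id].add(go_id)
    return mapping


def annotate(gene_hits, ext2go):
    """Transfer GO terms to each gene from the HMM families it hit."""
    annotated = {}
    for gene, hits in gene_hits.items():
        terms = set()
        for hit in hits:
            terms |= ext2go.get(hit, set())
        if terms:  # genes without any mapped family are left out
            annotated[gene] = terms
    return annotated


# Illustrative mapping lines and hypothetical HMM hits.
example_mapping = [
    "!example mapping of Pfam families to GO",
    "Pfam:PF00069 Pkinase > GO:protein kinase activity ; GO:0004672",
    "Pfam:PF00069 Pkinase > GO:ATP binding ; GO:0005524",
    "Pfam:PF00001 7tm_1 > GO:G protein-coupled receptor activity ; GO:0004930",
]
gene_hits = {"geneX": ["PF00069"], "geneY": ["PF00001", "PF99999"]}

for gene, terms in sorted(annotate(gene_hits, parse_external2go(example_mapping)).items()):
    # Two-column association format: gene <TAB> GO terms joined by ';'
    print(f"{gene}\t{';'.join(sorted(terms))}")
```

The output is already in the two-column layout described in the question, so it can be written straight to an association file.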
The file idmapping contains mappings from UniProt to other sequence databases (RefSeq, GenBank, etc.), but maybe not to GO.
The ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/proteomes/ repository contains the GO annotation for each species. You should be able to download the rice one and, if you are using UniProt accessions as it looks like you are, you just have to select the right columns.
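As a rough sketch of "selecting the right columns": assuming the downloaded file is in the tab-separated GAF layout (comment lines start with `!`, column 2 is the protein accession, column 4 the qualifier, column 5 the GO ID), something like the following Python would collapse it into the two-column association format. The function names and the sample rows are invented for illustration; I also chose to drop `NOT` annotations, which you may or may not want.

```python
from collections import defaultdict


def gaf_to_associations(gaf_lines):
    """Collect GO IDs per protein accession from GAF-formatted lines."""
    assoc = defaultdict(set)
    for line in gaf_lines:
        if not line.strip() or line.startswith("!"):  # header/comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue
        object_id, qualifier, go_id = cols[1], cols[3], cols[4]
        # Skip negative annotations such as 'NOT' or 'NOT|enables'.
        if "NOT" in qualifier.split("|"):
            continue
        assoc[object_id].add(go_id)
    return assoc


def format_associations(assoc):
    """Return lines of the form 'ID<TAB>GO:1;GO:2', sorted for stable output."""
    return [
        f"{gene}\t{';'.join(sorted(terms))}"
        for gene, terms in sorted(assoc.items())
    ]


# Invented sample rows (accessions P00001/P00002 are placeholders).
sample = [
    "!gaf-version: 2.1",
    "UniProtKB\tP00001\tgeneA\t\tGO:0005634\tPMID:1\tIDA\t\tC",
    "UniProtKB\tP00001\tgeneA\t\tGO:0003677\tPMID:1\tIDA\t\tF",
    "UniProtKB\tP00002\tgeneB\tNOT\tGO:0005634\tPMID:2\tIDA\t\tC",
]

for row in format_associations(gaf_to_associations(sample)):
    print(row)
```

For a real file you would pass an open file handle instead of the `sample` list, and use `cols[2]` (the symbol column) instead of `cols[1]` if you prefer gene symbols over accessions.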
If you really want the gene ID, you can also go to UniProt, search for your species (search taxonomy) and click on the UniProt link. From there, you can customize the display to show only the gene name and GO terms, and export the result as a tab-delimited file.
First of all, be careful: the ID you are looking for, Q8L9A8, is a protein ID (UniProt), not a gene ID.