Retrieve GO terms ontology
8.7 years ago

Hi, I have a list of 141 GO terms that I downloaded from Ensembl BioMart (GOSlim GOA). Is there an easy way to find the top-level ontology (cellular component, molecular function, biological process) of each one? For each term I would like to have:

GO:XXXXXX1 is a cellular component

GO:XXXXXX2 is a molecular function

GO:XXXXXX3 is a biological process

I have done some research but haven't found an easy way to do it. Before starting on something complicated, I wonder whether there is a simple way, or a tool, to do this.

8.7 years ago

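Using bash and QuickGO:

# go.sh: read GO IDs on stdin, print each ID followed by its root term
function fun1
    {
    # fetch the term as OBO-XML and take its first is_a parent (empty for a root term)
    TERM=$(curl -s "https://www.ebi.ac.uk/QuickGO/GTerm?id=${1}&format=oboxml" | xmllint --xpath 'normalize-space(//is_a[1]/text())' - )
    # no parent means $1 is a root; otherwise climb one level up
    if [[ -z "${TERM}" ]] ; then echo "$1" ; else fun1 "${TERM}" ; fi
    }

while read -r A
do
    echo -n "${A} "
    fun1 "${A}"
done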

example:

echo -e "GO:0003674\nGO:0097159" | bash go.sh
GO:0003674 GO:0003674
GO:0097159 GO:0003674
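
The second column is the ID of the root term, not its name. The three GO roots are GO:0008150 (biological process), GO:0003674 (molecular function) and GO:0005575 (cellular component), so the output can be turned into the sentences asked for in the question. A minimal sketch, assuming input in the "TERM ROOT" format printed by go.sh; `root_name` and `label_terms` are hypothetical helper names, not part of the original script:

```shell
# root_name (hypothetical helper): translate a GO root ID into its ontology name
root_name() {
    case "$1" in
        GO:0008150) echo "biological process" ;;
        GO:0003674) echo "molecular function" ;;
        GO:0005575) echo "cellular component" ;;
        *)          echo "unknown ontology" ;;
    esac
}

# label_terms (hypothetical helper): read "TERM ROOT" lines on stdin,
# print one "TERM is a <ontology>" sentence per line
label_terms() {
    while read -r term root
    do
        echo "${term} is a $(root_name "${root}")"
    done
}
```

Once both functions are defined in the current shell, something like `bash go.sh < terms.txt | label_terms` should work, where `terms.txt` stands for your own list of GO IDs.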

I keep being amazed by your XML manipulation fluency.


Thanks, that's exactly what I was looking for! You got me out of trouble.

8.7 years ago
biocyberman ▴ 870

Check out the GOParGenPy tool, or download the annotation file for your species and extract the information with gawk or your favorite language: http://geneontology.org/page/download-annotations

The data is in the Gene Ontology Annotation File (GAF) format: http://geneontology.org/page/go-annotation-file-format-20 The columns you want to extract are column 5 (the GO ID) and column 9 (the aspect: P, F or C).
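
As a rough illustration of the manual route, columns 5 and 9 can be pulled out with awk. This is a minimal sketch, assuming tab-separated GAF 2.x input on stdin with header lines starting with `!`; `gaf_aspects` is a hypothetical helper name:

```shell
# gaf_aspects (hypothetical helper): read a GAF 2.x file on stdin and print
# each distinct GO ID once, together with the ontology it belongs to
gaf_aspects() {
    awk -F '\t' '
        BEGIN {
            name["P"] = "biological process"
            name["F"] = "molecular function"
            name["C"] = "cellular component"
        }
        /^!/ { next }                               # skip header and comment lines
        !seen[$5]++ { print $5 " is a " name[$9] }  # column 5 = GO ID, column 9 = aspect
    '
}
```

For a gzipped annotation file this could be run as `zcat your_species.gaf.gz | gaf_aspects`, with `your_species.gaf.gz` standing in for the file you downloaded. Only GO terms that are actually used in that file will appear in the output.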


Thanks for your answers, but I already have the GO annotation for a subset of genes. My problem was finding out which ontology each term belongs to (cellular component, molecular function, biological process).


Hi, please use the following steps in R:

    library(GO.db)
    ids <- c("GO:2000096", "GO:2000145", "GO:2001020")  # your GO IDs
    select(GO.db, keys = ids, columns = "ONTOLOGY", keytype = "GOID")

The output will be:

            GOID ONTOLOGY
    1 GO:2000096       BP
    2 GO:2000145       BP
    3 GO:2001020       BP

where
BP = biological process
MF = molecular function
CC = cellular component

Hope this helps.


Thanks for your answer. I already have my result from Pierre Lindenbaum's answer, but it is always nice to discover a new library!


Just to clarify about GOParGenPy: despite the name, the tool is actually more useful for extracting annotations and combining various kinds of annotation information. Both GOParGenPy and the manual method can save running time and make you less dependent on queries over the internet.


It is a very interesting tool, thanks again!
