Retrieve GO terms using UniProt BLAST results (together with gene_association.goa_uniprot.gz)
11.3 years ago
shzhang ▴ 20

Hi all!

I have a set of differentially expressed (DE) genes from a non-model RNA-seq project, and I'd like to assign GO IDs to some of them.


I ran a blastx search of these DE genes against UniProtKB/Swiss-Prot with an E-value cut-off of 1e-5 and kept only the best match (-max_target_seqs 1). The blastx output is in XML format.

Then I downloaded gene_association.goa_uniprot.gz.

I have two questions:

  • For the DE genes that had no hits in UniProtKB/Swiss-Prot, is it necessary to run a second blastx search against UniProtKB/TrEMBL? (Swiss-Prot is manually curated, whereas TrEMBL is automatically annotated.)

  • I don't know how to use the blastx XML (or maybe tabular) output to retrieve GO IDs from the goa_uniprot dataset. Is there a script for this purpose?

Thanks.

Kind regards,

Senhao

11.3 years ago

Hey,

Answering your second question directly: yes, I have a script for that. I used it once to annotate the Ciona genome for an inter-species comparison.

I didn't write it; the author is Laurent Manchon. Here's a link to a gist: split_xml_blast_output.awk

You might also want to look at Blast2GO, which automates annotating BLAST results with GO terms.
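
For a local lookup, the core step is a join between the BLAST best hits and the GOA file. Here is a minimal Python sketch of that join, not a tested pipeline. It assumes the BLAST results are in tabular form (-outfmt 6, query ID in column 1, subject ID in column 2), that subject IDs look like sp|P12345|ABC_HUMAN, and that the GOA file follows the GAF layout (accession in column 2, GO ID in column 5, comment lines starting with "!"). The function names and the file name hits.tsv are made up for illustration.

```python
import gzip
from collections import defaultdict


def accession_from_sseqid(sseqid):
    """Return the accession from an ID such as 'sp|P12345|ABC_HUMAN'.

    Assumption: Swiss-Prot style IDs; anything else is returned unchanged.
    """
    parts = sseqid.split("|")
    return parts[1] if len(parts) >= 3 else sseqid


def read_best_hits(blast_lines):
    """Map each query ID to the accession of its first reported hit."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep only the first hit seen for each query.
        best.setdefault(fields[0], accession_from_sseqid(fields[1]))
    return best


def read_go_terms(gaf_lines, wanted):
    """Collect GO IDs from GAF lines, but only for the wanted accessions."""
    go = defaultdict(set)
    for line in gaf_lines:
        if line.startswith("!"):  # GAF header/comment line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 4 and fields[1] in wanted:
            go[fields[1]].add(fields[4])
    return go


def annotate(blast_path, gaf_path):
    """Return {query ID: sorted list of GO IDs} for the best BLAST hits."""
    with open(blast_path) as handle:
        best = read_best_hits(handle)
    # Stream the gzipped GOA file instead of loading it into memory.
    with gzip.open(gaf_path, "rt") as handle:
        go = read_go_terms(handle, set(best.values()))
    return {query: sorted(go.get(acc, ())) for query, acc in best.items()}
```

Calling annotate("hits.tsv", "gene_association.goa_uniprot.gz") would return a dict from query ID to a sorted list of GO IDs, with an empty list for hits that have no annotation. The GOA file is read line by line and only accessions that occur among the hits are kept, so memory use stays small. Annotations whose qualifier column contains NOT are not filtered out here; you may want to skip those.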


Hi Andre, thanks for your reply. The script is very useful for splitting BLAST XML results, but maybe I didn't express my question clearly. How does this script relate to retrieving GO IDs from the gene_association.goa_uniprot.gz dataset?

I tried Blast2GO, but it is very slow, so I'd like to do the annotation locally.

Thanks.
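
Since the blastx output here is XML, the best hit per query has to be pulled out before any GO lookup. A minimal sketch with Python's standard library follows. It assumes the usual BLAST XML element names (Iteration, Iteration_query-def, Iteration_hits, Hit, Hit_id, Hit_accession, Hit_def); the function name is hypothetical. Which field actually holds the UniProt accession depends on how the database was built, so all three hit fields are returned.

```python
import xml.etree.ElementTree as ET


def best_hits_from_xml(source):
    """Yield (query, hit_id, hit_accession, hit_def) for the first hit of each query.

    `source` is a file name or a binary file object holding BLAST XML.
    Queries without any hit are skipped.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "Iteration":
            continue
        query = elem.findtext("Iteration_query-def")
        hit = elem.find("Iteration_hits/Hit")  # first hit only
        if hit is not None:
            yield (
                query,
                hit.findtext("Hit_id"),
                hit.findtext("Hit_accession"),
                hit.findtext("Hit_def"),
            )
        elem.clear()  # release the processed iteration to limit memory use
```

The accessions obtained this way can then be looked up in column 2 of the GOA file to collect the GO IDs in column 5.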
