Best strategy to get GO terms for a proteome?
10.7 years ago
biotech ▴ 570

Hi,

I would like to know the best strategy to get the largest number of GO terms for the bacterial proteome I'm working on. Since it's a non-model organism, I will build the GO database from scratch.

I obtained GO annotations for 60% of the proteome by BLASTing against the bacterial nr protein database (retaining the first 20 hits), but some of the terms are very general. I obtained the same results with BLAST2GO InterPro mapping.

I've been thinking of BLASTing against both the UniProt and nr databases and merging the results. I would also like to know how many hits I should retain from the BLAST searches.

Thanks, Bernardo

P.S. I've also posted this question on Seqanswers forum.

uniprot nr blast2go BLAST GO • 3.2k views
5.3 years ago
predeus ★ 2.1k

In case somebody is searching for this: the newest version of InterProScan works very nicely and adds GO terms as well as Reactome pathways to the annotation, based on the domains it discovers. Of course, this approach has a number of limitations, but I still think it is the easiest way to do it. It took me about 8 hours on 64 cores for 38,000 proteins.

interproscan.sh -i <protein_fasta> -f tsv -b <output> -goterms -cpu 64 -etra -pa
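Once the run finishes, the GO terms have to be pulled out of the TSV file. Here is a minimal sketch of how that could look, assuming the standard InterProScan TSV layout in which the protein accession is column 1 and the GO annotations requested with -goterms are in column 14, separated by "|". The function name go_terms_per_protein is made up for illustration, and the regular expression is only there to tolerate any suffix a given version may attach to each GO ID.

```python
import re
from collections import defaultdict

# Matches standard GO identifiers such as GO:0005515.
GO_ID = re.compile(r"GO:\d{7}")


def go_terms_per_protein(tsv_lines):
    """Collect the GO IDs found for each protein in InterProScan TSV rows.

    Assumes column 1 is the protein accession and column 14 holds the
    GO annotations (present only when -goterms was used).
    """
    terms = defaultdict(set)
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 14:
            # Rows without the optional annotation columns carry no GO terms.
            continue
        terms[fields[0]].update(GO_ID.findall(fields[13]))
    # Drop proteins whose rows had no GO IDs (for example "-" placeholders).
    return {protein: sorted(ids) for protein, ids in terms.items() if ids}
```

With a real output file this could be called as go_terms_per_protein(open("output.tsv")), where the file name depends on what was passed to -b.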

10.7 years ago
pld 5.1k

You should retain only one hit per subject species, and these hits should be verified through reciprocal BLAST. Multiple hits in a single species for a given gene of interest don't make sense. It's equivalent to saying your gene of interest performs all of the functions of the n genes in the subject species.
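To make the reciprocal check concrete, here is a rough sketch that keeps only reciprocal best hits between two BLAST runs (your proteome against the subject species, and the reverse). It assumes both result files are in the default tabular format (-outfmt 6), where column 1 is the query, column 2 the subject and column 12 the bit score. The function names best_hits and reciprocal_best_hits are hypothetical, and picking the best hit by bit score is one common choice rather than the only one.

```python
def best_hits(blast_lines):
    """Map each query to its highest-bit-score subject (BLAST -outfmt 6 rows)."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        # Keep the first hit seen at the highest bit score for this query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_lines, reverse_lines):
    """Return query -> subject pairs that are each other's best hit."""
    forward = best_hits(forward_lines)
    reverse = best_hits(reverse_lines)
    return {q: s for q, s in forward.items() if reverse.get(s) == q}
```

Only the pairs returned by reciprocal_best_hits would then be used to transfer GO terms.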

You can merge the results, but I imagine you'll get a lot of duplicates. I'm not very familiar with BLAST2GO, but RefSeq doesn't natively carry GO annotation, so B2G must be pulling the terms from somewhere else.
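If you do merge, the duplicates are easy to collapse. As an illustration only, assuming each search has already been reduced to a mapping from protein ID to GO IDs (the function name merge_go_annotations and that data layout are my own assumptions, not anything BLAST2GO produces directly):

```python
def merge_go_annotations(*sources):
    """Union GO terms per protein across several annotation sources.

    Each source is assumed to be a dict mapping a protein ID to an
    iterable of GO IDs; duplicates collapse because sets are used.
    """
    merged = {}
    for source in sources:
        for protein, terms in source.items():
            merged.setdefault(protein, set()).update(terms)
    return {protein: sorted(terms) for protein, terms in merged.items()}
```

Comparing the number of terms per protein before and after merging gives a quick idea of how much the second database actually adds.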

Getting many generic results after GO annotation is common, at least in my experience. If a given gene's orthologs are poorly annotated, you can't do any better.
