Hi all!
I have some differentially expressed (DE) genes from an RNA-seq project on a non-model organism, and I'd like to assign GO IDs to some of these DE genes.
I ran a blastx search of these DE genes against UniProtKB/Swiss-Prot with an E-value cut-off of 1e-5 and retained one best match (-max_target_seqs 1). The blastx output is in XML format.
Then I downloaded gene_association.goa_uniprot.gz.
I have two questions:
1. Is it necessary to run a blastx search against the UniProtKB/TrEMBL database for the DE genes that had no hits against UniProtKB/Swiss-Prot? (Swiss-Prot entries are curated, whereas TrEMBL entries are automatically annotated.)
2. I don't know how to use the blastx XML (or maybe tabular) result to retrieve GO IDs from the goa_uniprot dataset. Is there a script for this purpose?
Thanks.
Kind regards,
Senhao
Hi Andre, thanks for your reply. The script is very useful for splitting BLAST XML results. Maybe I didn't express my question clearly: I'm wondering what the relationship is between this script and retrieving GO IDs from the gene_association.goa_uniprot.gz dataset.
I tried Blast2GO, but it's very slow, so I'd like to do this locally.
Thanks.
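For the second question, here is a minimal Python sketch of one way to do the lookup locally. It is not an established tool, and it rests on several assumptions: the blastx search is rerun with tabular output (-outfmt 6) instead of XML, the subject IDs look like sp|P12345|NAME_SPECIES, and the GOA file is in standard GAF format (tab-separated, accession in column 2, qualifier in column 4, GO ID in column 5, comment lines starting with "!"). The function names and the file name blast.tsv are made up for illustration.

```python
import gzip
from collections import defaultdict


def accession_from_subject(subject_id):
    """Extract a UniProt accession from a BLAST subject ID.

    Assumes IDs such as 'sp|P12345|ABC_HUMAN'; anything else is
    returned as-is. A trailing '.N' version suffix is dropped.
    """
    parts = subject_id.split("|")
    acc = parts[1] if len(parts) >= 3 else subject_id
    return acc.split(".")[0]


def read_blast_hits(lines):
    """Map each query to the accession of its first listed hit (outfmt 6)."""
    hits = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 1 is the query ID, column 2 the subject ID.
        hits.setdefault(cols[0], accession_from_subject(cols[1]))
    return hits


def collect_go_terms(gaf_lines, accessions):
    """Collect GO IDs from GAF lines for the given set of accessions."""
    go = defaultdict(set)
    for line in gaf_lines:
        if line.startswith("!"):  # GAF header/comment line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5 or cols[1] not in accessions:
            continue
        # Skip negated annotations (qualifier contains NOT).
        if "NOT" in cols[3].split("|"):
            continue
        go[cols[1]].add(cols[4])
    return go


def annotate(blast_path, gaf_path):
    """Return {query: sorted list of GO IDs} for a BLAST table and a gzipped GAF."""
    with open(blast_path) as fh:
        hits = read_blast_hits(fh)
    # Stream the GAF file line by line; it is too large to load at once.
    with gzip.open(gaf_path, "rt") as fh:
        go = collect_go_terms(fh, set(hits.values()))
    return {query: sorted(go.get(acc, ())) for query, acc in hits.items()}
```

Usage would be something like annotate("blast.tsv", "gene_association.goa_uniprot.gz"). The GOA file is read in a single pass and only annotations for accessions that appear in the BLAST hits are kept, so memory use stays small even though the file is very large.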