Functional Annotations From BLAST
Hello,
I'm writing a program that takes sequences in some format (FASTA, for example), performs a BLAST search on each one, takes a functional annotation for each result, and then looks in the Gene Ontology database to get the standard representation of these annotations. The problem is that BLAST returns only similar sequences, not the annotations of those sequences. All I get is a description and an accession number (like AK333026), and I don't know what to do with this number. Look it up in some database? OK, but I don't think there is a single database to look in. So, given an accession number, how can I determine which database to query? Also, I'm doing this with Biopython, and I'm not quite sure whether it provides an interface for looking up all those databases. Any hints?
biopython
blast
The identifiers are from NCBI's databases, which you query using Entrez. Biopython provides an interface to retrieve records from Entrez, with full documentation in the Tutorial.
To query and retrieve the GenBank record for your example:
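from Bio import Entrez
from Bio import SeqIO
# NCBI asks for a contact address; replace this with your own.
Entrez.email = "test@example.com"
accession = "AK333026"
# Search for the accession to get its NCBI identifier.
handle = Entrez.esearch(db="nucleotide", term=accession)
rec = Entrez.read(handle)
handle.close()
search_id = rec["IdList"][0]
# Fetch the full GenBank record and parse it into a SeqRecord.
handle = Entrez.efetch(db="nucleotide", id=search_id, rettype="gb",
                       retmode="text")
rec = SeqIO.read(handle, "genbank")
handle.close()
print(rec)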
You can use the resulting Biopython SeqRecord to retrieve the information you're interested in.
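As a rough illustration of that last step, here is a minimal sketch that gathers a few qualifier values from the features of a record. The helper name `collect_qualifiers` and the default choice of keys are my own assumptions, not part of Biopython. It only assumes the record has a `features` list whose items carry a `qualifiers` dict mapping names to lists of strings, which is how a SeqRecord parsed from GenBank is laid out.

```python
from collections import defaultdict


def collect_qualifiers(record, keys=("gene", "product", "db_xref")):
    """Gather selected qualifier values from every feature of a record.

    Hypothetical helper: `record` is assumed to look like a Biopython
    SeqRecord, i.e. it has a `features` list and each feature has a
    `qualifiers` dict mapping qualifier names to lists of strings.
    """
    found = defaultdict(set)
    for feature in record.features:
        for key in keys:
            # Qualifiers that a feature does not carry are simply skipped.
            for value in feature.qualifiers.get(key, []):
                found[key].add(value)
    # Sort each group of values so the result is stable between runs.
    return {key: sorted(values) for key, values in found.items()}
```

With the `rec` fetched above, `collect_qualifiers(rec)` would return a dict of whichever of those qualifiers the record actually carries. Cross-references found under `db_xref` may then be a starting point for looking up Gene Ontology terms in another resource, but which cross-references are present depends on the record.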
Related: I wrote a Java program that displays the GenBank annotations for the BLAST hit and the BLAST query. See my post: http://plindenbaum.blogspot.fr/2010/11/blastxmlannotations.html
QUERY: Homo sapiens eukaryotic translation initiation factor 4 gamma, 1 (EIF4G1), transcript variant 2, mRNA
ID:gi|303227906|ref|NM_198241.2| Len:5538
>Mus musculus eukaryotic translation initiation factor 4, gamma 1 (Eif4g1), transcript variant 2, mRNA
NM_001005331
id:gi|56699433|ref|NM_001005331.1| len:5460
e-value:0 gap:138 bitScore:6818.02
#####:############################################ exon 1..180 gene:EIF4G1
QUERY 000000053 GGCGCCGGCTGCGCCTGCGGAGAAGCGGTGGCCGCCGAGCGGGATCTGTG 000000102
||||| ||||||||||||||||||||||||||||||||||||||||||||
HIT 000000001 GGCGCTGGCTGCGCCTGCGGAGAAGCGGTGGCCGCCGAGCGGGATCTGTG 000000050
#####:############################################ exon 1..128 gene:Eif4g1
################################################## exon 1..180 gene:EIF4G1
QUERY 000000103 CGGGGAGCCGGAAATGGTTGTGGACTACGTCTGTGCGGCTGCGTGGGGCT 000000152
||||||||||||||||||||||||||||||||||||||||||||||||||
HIT 000000051 CGGGGAGCCGGAAATGGTTGTGGACTACGTCTGTGCGGCTGCGTGGGGCT 000000100
################################################## exon 1..128 gene:Eif4g1
############::::::::::###### exon 1..180 gene:EIF4G1
#:::::::::::::###::::: exon 181..237 gene:EIF4G1
QUERY 000000153 CGGCCGCGCGGACTGAAGGAGACTGAAGGCCCTCGGATGCCCAGAACCTG 000000202
|||||||||||| ||||||| |||
HIT 000000101 CGGCCGCGCGGA----------CTGAAGG-------------AGA----- 000000122
############----------#######-------------### gene 1..5460 gene:Eif4g1
############----------#######-------------### exon 1..128 gene:Eif4g1
::::::::::::::::::::::##:##:::::::# exon 181..237 gene:EIF4G1
############### exon 238..331 gene:EIF4G1
QUERY 000000203 TAGGCCGCACCGTGGACTTGTTCTTAATCGAGGGGGTGCTGGGGGGACCC 000000252
|| || ||||||||||||||||
HIT 000000123 ----------------------CTGAA-------GGTGCTGGGGGGACCC 000000143
----------------------##:##-------# exon 1..128 gene:Eif4g1
############### exon 129..222 gene:Eif4g1
#:###############################:###:############ exon 238..331 gene:EIF4G1
##############:###:############ CDS 272..5071 gene:EIF4G1
QUERY 000000253 TGATGTGGCACCAAATGAAATGAACAAAGCTCCACAGTCCACAGGCCCCC 000000302
| ||||||||||||||||||||||||||||||| ||| ||||||||||||
HIT 000000144 TAATGTGGCACCAAATGAAATGAACAAAGCTCCCCAGCCCACAGGCCCCC 000000193
#:###############################:###:############ exon 129..222 gene:Eif4g1
##############:###:############ CDS 163..4944 gene:Eif4g1
(...)
############:#:#:#####:######:#:########:##:###### exon 4890..5521 gene:EIF4G1
############:#:#:#####:######:#:########:##:###### STS 4948..5505 gene:EIF4G1
############:#:#:#####:######:#:########:##:###### STS 5174..5403 gene:EIF4G1
QUERY 000005319 TTGGTGTGTCTTGGGGTGGGGAGGGGCACCAACGCCTGCCCCTGGGGTCC 000005368
|||||||||||| | | ||||| |||||| | |||||||| || ||||||
HIT 000005201 TTGGTGTGTCTTTGCGGGGGGAAGGGCACTACCGCCTGCCTCTAGGGTCC 000005250
############:#:#:#####:######:#:########:##:###### exon 4760..5396 gene:Eif4g1
::##############:##########:###################### exon 4890..5521 gene:EIF4G1
::##############:##########:###################### STS 4948..5505 gene:EIF4G1
::##############:##########:####### STS 5174..5403 gene:EIF4G1
QUERY 000005369 TTTTTTTTATTTTCTGAAAATCACTCTCGGGACTGCCGTCCTCGCTGCTG 000005418
|||||||||||||| |||||||||| ||||||||||||||||||||||
HIT 000005251 --TTTTTTATTTTCTG-AAATCACTCTTGGGACTGCCGTCCTCGCTGCTG 000005297
--##############-##########:###################### exon 4760..5396 gene:Eif4g1
######################:#############:############# exon 4890..5521 gene:EIF4G1
######################:#############:############# STS 4948..5505 gene:EIF4G1
QUERY 000005419 GGGGCATATGCCCCAGCCCCTGTACCACCCCTGCTGTTGCCTGGGCAGGG 000005468
|||||||||||||||||||||| ||||||||||||| |||||||||||||
HIT 000005298 GGGGCATATGCCCCAGCCCCTGCACCACCCCTGCTGCTGCCTGGGCAGGG 000005347
######################:#############:############# exon 4760..5396 gene:Eif4g1
#:##-############################################: exon 4890..5521 gene:EIF4G1
#:##-################################# STS 4948..5505 gene:EIF4G1
###### polyA_signal 5496..5501 gene:EIF4G1
# polyA_site 5516 gene:EIF4G1
QUERY 000005469 GGAA-GGGGGGGCACGGTGCCTGTAATTATTAAACATGAATTCAATTAAG 000005517
| || ||||||||||||||||||||||||||||||||||||||||||||
HIT 000005348 GAAAGGGGGGGGCACGGTGCCTGTAATTATTAAACATGAATTCAATTAAA 000005397
#:##:############################################ exon 4760..5396 gene:Eif4g1
:::# exon 4890..5521 gene:EIF4G1
# polyA_site 5521 gene:EIF4G1
QUERY 000005518 CTCAAAAAAAAAAAAAAAAAA 000005538
||||||||||||||||||
HIT 000005398 AAAAAAAAAAAAAAAAAAAAA 000005418
What do you want to do? An alternate to blast2GO?