I am working with data retrieved from IslandViewer.
The list file shows the accession numbers of genomes and the corresponding start and end positions of each genomic island.
For example:
accession start end
NC_000853.1 43768 64411
NC_000853.1 659002 664820
NC_000907.1 1498827 1513221
and so on
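As a side note, a list in this format can be read into Python roughly as follows. This is only a minimal sketch: it assumes whitespace-separated columns with one header line, and the function name `parse_island_list` is made up for illustration.

```python
def parse_island_list(text):
    """Parse 'accession start end' lines into (accession, start, end) tuples."""
    islands = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and any line without numeric coordinates
        if len(fields) != 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        islands.append((fields[0], int(fields[1]), int(fields[2])))
    return islands


sample = """accession start end
NC_000853.1 43768 64411
NC_000853.1 659002 664820
NC_000907.1 1498827 1513221
"""

for accession, start, end in parse_island_list(sample):
    print(accession, start, end)
```

Each tuple can then be used in turn to look up the corresponding region.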
I want to find out which genes lie within those start and end positions,
so I wrote the following code, using the first entry from the list above:
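from Bio import Entrez, SeqIO
from Bio.Blast import NCBIWWW
from Bio.SeqFeature import SeqFeature, FeatureLocation

Entrez.email = "you@example.com"  # placeholder; NCBI asks for a real contact address

# Fetch the genome sequence in FASTA format
with Entrez.efetch(db="nucleotide", rettype="fasta", id="NC_000853.1", retmode="text") as handle:
    record = SeqIO.read(handle, "fasta")

# Biopython locations are 0-based; if the list is 1-based, the start should be 43767
feature = SeqFeature(FeatureLocation(43768, 64411, strand=1), type="gene")
g_island = feature.extract(record.seq)

# qblast returns a handle to the XML result, so read it instead of printing the handle
result = NCBIWWW.qblast("blastn", "nt", str(g_island))
print(result.read())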
I extracted the sequence between the start and end positions and tried to run a BLAST search on it from within Biopython, but it didn't work. How should I parse them?
Hello seok1213neo,
You should show us exactly what your data looks like. Otherwise we can only guess.
fin swimmer
Thank you for your reply! I have edited my post.
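For the question above, one possible approach that avoids BLAST altogether is to download the GenBank record, which carries the gene annotation, and keep the gene features that fall inside the island. The following is a sketch under stated assumptions, not a tested pipeline: the helper names `fetch_genbank` and `genes_in_region` are hypothetical, the e-mail address is a placeholder, and the list coordinates are assumed to be 1-based and inclusive.

```python
def fetch_genbank(accession, email):
    """Download a GenBank record with its annotated features (needs network access)."""
    # Imported here so that genes_in_region below can be used without Biopython
    from Bio import Entrez, SeqIO

    Entrez.email = email
    with Entrez.efetch(db="nucleotide", id=accession,
                       rettype="gbwithparts", retmode="text") as handle:
        return SeqIO.read(handle, "genbank")


def genes_in_region(features, start, end):
    """Return (label, start, end) for gene features lying fully inside the region.

    start and end are assumed to be 1-based inclusive, as in the list file.
    Biopython feature locations are 0-based, so start is shifted by one.
    """
    hits = []
    for feature in features:
        if feature.type != "gene":
            continue
        f_start = int(feature.location.start)  # 0-based start
        f_end = int(feature.location.end)      # exclusive end, equal to the 1-based end
        if f_start >= start - 1 and f_end <= end:
            quals = feature.qualifiers
            label = (quals.get("gene") or quals.get("locus_tag") or ["unknown"])[0]
            hits.append((label, f_start + 1, f_end))
    return hits


# Example use (needs network access):
# record = fetch_genbank("NC_000853.1", "you@example.com")
# for label, g_start, g_end in genes_in_region(record.features, 43768, 64411):
#     print(label, g_start, g_end)
```

Only genes lying completely inside the island are reported here. To include genes that merely overlap the island boundaries, the condition could be relaxed to `f_start < end and f_end > start - 1`.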