Hi,
I am trying to access the gene sequences of a given organism using Biopython. After searching the Gene database with a query such as "Escherichia coli[organism]", I download an XML file containing the gene loci and identifiers. Using that information, I now want to retrieve the sequence from the corresponding entry in the nucleotide database. For this purpose I type:
In [144]: handle = Entrez.efetch(db="nucleotide", id="NC_005327", rettype='gb', retmode='text')
In [145]: rec = SeqIO.read(handle, 'genbank')
In [146]: rec.seq
Out[146]: UnknownSeq(92353, alphabet = IUPACAmbiguousDNA(), character = 'N')
In [147]: print rec.seq.tostring()
Out[147]: NNNNNNNNNNNNNNNNNNNN (.....) NNNNNNNNNNNNNNNNNNNNNNNNN
When I access the nucleotide database entry for this accession through the website, the sequence is there and the respective genes can be mapped onto it. What's wrong? Am I messing something up during parsing?
Thanks in advance for any insight!
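One possible explanation, offered as a sketch rather than a confirmed diagnosis: for some records, rettype='gb' returns a GenBank file without the sequence itself, and Biopython then fills in a placeholder of N's of the right length. Asking for rettype='gbwithparts' (or 'fasta') should return the full sequence. The helper names efetch_params, looks_like_placeholder and fetch_record below are hypothetical, as is the e-mail address you pass in; only Entrez.efetch and SeqIO.read are Biopython calls.

```python
def efetch_params(accession, with_sequence=True):
    """Build keyword arguments for Bio.Entrez.efetch (hypothetical helper)."""
    return {
        "db": "nucleotide",
        "id": accession,
        # Assumption: "gbwithparts" returns the full sequence, "gb" may not.
        "rettype": "gbwithparts" if with_sequence else "gb",
        "retmode": "text",
    }


def looks_like_placeholder(sequence):
    """Return True if the sequence is empty or consists only of 'N'."""
    text = str(sequence).upper()
    return len(text) == 0 or set(text) == {"N"}


def fetch_record(accession, email):
    """Download one GenBank record, including its sequence (needs network)."""
    # Imported here so the helpers above also work without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(**efetch_params(accession))
    try:
        return SeqIO.read(handle, "genbank")
    finally:
        handle.close()
```

Calling fetch_record("NC_005327", "you@example.com") with your own address should then give a record whose rec.seq holds real bases; looks_like_placeholder on a slice of it is a quick way to check.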
Works perfectly, thanks!