Hi guys, I have managed to write the Python code below, which reads a file of protein IDs and imports their sequences from GenBank. I now want to write a script that imports all the protein sequences on a given chromosome, since I don't have all their IDs. In other words, given only the chromosome number, I want to import every protein sequence on that chromosome.
Any suggestions would be appreciated!
Thank you

    import numpy as np
    from Bio import Entrez, SeqIO

    Entrez.email = 'me@uga.edu'

    # Read the protein IDs from the first column of the file.
    z = np.genfromtxt(r'C:\Users\Mohammed\Desktop\ProteinIDs.txt',
                      dtype='U12', delimiter=',', usecols=[0])

    for prot in z[:500]:
        print(prot)
        # Pass the ID itself; id="prot" would send the literal string "prot".
        handle = Entrez.efetch(db="protein", id=prot, rettype="fasta", retmode="text")
        record = SeqIO.read(handle, "fasta")
        handle.close()
        # One FASTA file per protein, named after its ID.
        with open(r'C:\Users\Mohammed\Desktop\protein_seqs\%s.txt' % prot, 'w') as f:
            SeqIO.write(record, f, "fasta")
        print(record)

Can you indent your code? Put 4 spaces in front of it (this will make it a code block), and make sure that the for loops are indented correctly. This makes it easier for us to evaluate your code.
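
On the actual question (all proteins on a chromosome without knowing their IDs), here is a minimal sketch of one possible approach, not a tested solution. It assumes you can look up the nucleotide accession of the chromosome record yourself, fetches that record from the `nuccore` database in GenBank format, and collects the `protein_id` and `translation` qualifiers of every annotated CDS feature. The function names `cds_proteins` and `fetch_chromosome_proteins`, the output path, and the accession argument are all hypothetical; only `Entrez.efetch` and `SeqIO.read` come from Biopython. Large chromosomes can be a very big download, so this may be more practical for smaller genomes.

```python
def cds_proteins(record):
    """Yield (protein_id, sequence) for each annotated CDS in a GenBank record.

    `record` is expected to look like a Biopython SeqRecord: it has a
    `features` list whose items carry `type` and `qualifiers`.
    """
    for feature in record.features:
        if feature.type != "CDS":
            continue
        quals = feature.qualifiers
        # Pseudogenes and some partial CDS entries carry no translation.
        if "protein_id" in quals and "translation" in quals:
            yield quals["protein_id"][0], quals["translation"][0]


def fetch_chromosome_proteins(accession, email, out_path):
    """Download one chromosome record and write its proteins as FASTA.

    `accession` is the nucleotide accession of the chromosome, which you
    have to look up yourself (assumption: it is not derived here).
    Returns the number of proteins written.
    """
    # Imported here so the helper above can be used without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email
    # "gbwithparts" requests the full GenBank record with its annotations.
    handle = Entrez.efetch(db="nuccore", id=accession,
                           rettype="gbwithparts", retmode="text")
    record = SeqIO.read(handle, "genbank")
    handle.close()

    count = 0
    with open(out_path, "w") as out:
        for protein_id, seq in cds_proteins(record):
            out.write(">%s\n%s\n" % (protein_id, seq))
            count += 1
    return count
```

You would call it as `fetch_chromosome_proteins(your_accession, 'me@uga.edu', 'chromosome_proteins.fasta')`, where `your_accession` stands for the chromosome's accession. This writes all proteins to a single FASTA file rather than one file per protein, which avoids one request per ID.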