After finding the gene IDs corresponding to a list of gene symbols, I am trying to grab the gene summary for each of those IDs. When I use Entrez.read to parse the summaries I have fetched (using Entrez.esummary), I get a strangely structured list/dictionary whose keys I can't make out. For example, below I try to print the values under OtherDesignations, and I get a KeyError. Can someone help me out?
import sys
from Bio import Entrez
import xlrd
Entrez.email = "john.doe@mail.com"
wb = xlrd.open_workbook('C:/Users/user/geneSymbolsTest.xlsx')
sh = wb.sheet_by_index(0)
colA = sh.col_values(0)
colA.pop(0)
symbol_list = []
for x in colA:
    symbol_list.append(str(x))
id_list = []
summary = []
parsedSummary = []
for x in symbol_list:
    sterm = x + '[sym] "Mus musculus"[orgn]'
    handle = Entrez.esearch(db="gene", retmode = "xml", term = sterm )
    record = Entrez.read(handle)
    IDArray = record["IdList"]
    toString = str(IDArray[0])
    summary = Entrez.esummary(db="gene", retmode = "xml", id = toString)
    parsedSummary = Entrez.read(summary)
    entry = parsedSummary[0]["OtherDesignation"]
    print entry
Hi, could you please add an example value for "x" (one of your gene symbols), so that we can test your code the same way you ran it?
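In the meantime, one way to make out the keys of a nested result like this is to walk it recursively and list the full path to every value. The snippet below is only a sketch: key_paths is a hypothetical helper (not part of Biopython), and sample is a hand-made stand-in rather than real NCBI output, so the key names in it are assumptions. It also assumes that the object returned by Entrez.read behaves like nested dicts and lists.

```python
def key_paths(node, prefix=""):
    """Return the index path to every leaf value in a nested dict/list structure."""
    paths = []
    if isinstance(node, dict):
        # Descend into each key of a dictionary-like node.
        for key in node:
            paths.extend(key_paths(node[key], prefix + "[%r]" % (key,)))
    elif isinstance(node, (list, tuple)):
        # Descend into each position of a list-like node.
        for index, item in enumerate(node):
            paths.extend(key_paths(item, prefix + "[%d]" % index))
    else:
        # Anything else (string, number, ...) is treated as a leaf.
        paths.append(prefix)
    return paths


# Hand-made stand-in for a parsed summary; the key names are assumptions,
# not guaranteed to match what NCBI actually returns.
sample = {
    "DocumentSummarySet": {
        "DocumentSummary": [
            {"Name": "ExampleGene", "OtherDesignations": "example designation"}
        ]
    }
}

for path in key_paths(sample):
    print("parsedSummary" + path)
```

Running the same helper on your real parsedSummary should show which combination of keys and indices leads to the field you want, and whether the top level is a list (indexable with 0) or a dictionary.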