Hi,
I tried to search the NCBI Gene database using Biopython's Bio.Entrez module for the following arbitrary terms: ["egfr","pi3k","puma"].
I was expecting to get the entries for EGFR, PI3K and PUMA. However, the IdList always returned '7157' (TP53) as the first item for all 3 queries.
Before I ran this test, I searched for TP53 using esearch and it correctly returned 7157 as the first entry. However, every subsequent esearch query has also returned 7157 as the first element, even though TP53 is not related to any of the queries. Here is my code:
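from Bio import Entrez

Entrez.email = "you@example.com"  # placeholder address; NCBI asks for a contact email

idlist = []
terms = ["egfr", "pi3k", "puma"]
for termd in terms:
    print("search for ", termd)
    # esearch returns a handle to the XML result
    info2 = Entrez.esearch(db="gene", term=termd)
    print("\n \n", info2)
    # parse the XML into a dictionary-like record
    record2 = Entrez.read(info2)
    print(record2)
    # keep only the top hit for each term
    idlist.append(record2["IdList"][0])
print(idlist)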
And here is what the shell returned:
search for egfr
<_io.TextIOWrapper encoding='utf-8'>
{'Count': '6303', 'RetMax': '20', 'RetStart': '0', 'IdList': ['7157', '1956', '7124', '7422', '3569', '7040', '22059', '2064', '2099', '3091', '351', '672', '4318', '9370', '5243', '1401', '207', '367', '4790', '21898'], 'TranslationSet': [], 'TranslationStack': [{'Term': 'egfr[All Fields]', 'Field': 'All Fields', 'Count': '6303', 'Explode': 'N'}, 'GROUP'], 'QueryTranslation': 'egfr[All Fields]'}
search for pi3k
<_io.TextIOWrapper encoding='utf-8'>
{'Count': '15392', 'RetMax': '20', 'RetStart': '0', 'IdList': ['7157', '1956', '7124', '7422', '3569', '7040', '22059', '2064', '2099', '3586', '3091', '351', '672', '4318', '9370', '5243', '1401', '207', '367', '4790'], 'TranslationSet': [], 'TranslationStack': [{'Term': 'pi3k[All Fields]', 'Field': 'All Fields', 'Count': '15392', 'Explode': 'N'}, 'GROUP'], 'QueryTranslation': 'pi3k[All Fields]'}
search for puma
<_io.TextIOWrapper encoding='utf-8'>
{'Count': '24385', 'RetMax': '20', 'RetStart': '0', 'IdList': ['7157', '1956', '7124', '7040', '22059', '2064', '207', '21898', '6774', '3845', '1029', '4609', '2475', '21803', '596', '5594', '1026', '332', '11651', '355'], 'TranslationSet': [{'From': 'puma', 'To': '"Puma concolor"[Organism] OR "Puma"[Organism] OR puma[All Fields]'}], 'TranslationStack': [{'Term': '"Puma concolor"[Organism]', 'Field': 'Organism', 'Count': '23959', 'Explode': 'Y'}, {'Term': '"Puma"[Organism]', 'Field': 'Organism', 'Count': '23996', 'Explode': 'Y'}, 'OR', {'Term': 'puma[All Fields]', 'Field': 'All Fields', 'Count': '24385', 'Explode': 'N'}, 'OR', 'GROUP'], 'QueryTranslation': '"Puma concolor"[Organism] OR "Puma"[Organism] OR puma[All Fields]'}
['7157', '7157', '7157']
You can see that the Count, TranslationStack and QueryTranslation values differ between queries, but the very first element in IdList is always 7157.
How do I get around this problem?
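One thing I have not ruled out: the QueryTranslation values above show that each bare term is searched as [All Fields], so any Gene record that mentions the term anywhere would match, and a heavily annotated gene such as TP53 could plausibly come out on top. Below is a minimal sketch of a field-restricted query I am considering. The helper names build_gene_query and first_gene_id are my own (hypothetical), and the choice of the [Gene Name] and [Organism] field tags and of "Homo sapiens" as the organism is an assumption about what I actually want, not something I have confirmed fixes the ranking.

```python
def build_gene_query(symbol, organism="Homo sapiens"):
    """Build a Gene query restricted to the gene name and organism fields.

    Hypothetical helper: the field tags used here are an assumption
    about which restriction is appropriate for this search.
    """
    return f'{symbol}[Gene Name] AND "{organism}"[Organism]'


def first_gene_id(symbol, organism="Homo sapiens"):
    """Return the first Gene ID for a restricted query, or None if no hits.

    Hypothetical helper; requires Biopython and network access.
    """
    from Bio import Entrez  # imported here so build_gene_query works without Biopython

    Entrez.email = "you@example.com"  # placeholder contact address
    handle = Entrez.esearch(db="gene", term=build_gene_query(symbol, organism))
    record = Entrez.read(handle)
    handle.close()
    ids = record["IdList"]
    return ids[0] if ids else None


# Show the query strings that would be sent for the three terms.
for term in ["egfr", "pi3k", "puma"]:
    print(build_gene_query(term))
```

Calling first_gene_id("egfr") would then run the restricted search. Note that "pi3k" and "puma" are common aliases rather than official symbols, so I am not sure a name-restricted query would match them at all.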
Thank you very much.