Iterating through my list of search terms only returns results from the last term in the list
5.0 years ago
jwang • 0

Hi, I've been trying to loop through a list of search terms I'm interested in. When I run the first part of my code, I only get publications matching the last search term in my list. Instead, I want it to iterate through the list and append to the record_list object for each new search term, not just overwrite it with the results from the last search. Thanks!!

from Bio import Entrez
from Bio import Medline
from tqdm import tqdm
import pandas as pd
pd.set_option('display.max_colwidth', None)  # None means no truncation; -1 is deprecated in recent pandas
import numpy as np
# Change this email to your email address
Entrez.email = "Put your email address here"


disease_list = ['ebola', 'aml', 'primary glomerular disease associated with significant proteinuria']

#search and return total number of publications 
def search(x):
    Entrez.email=Entrez.email
    results = {}
    for x in disease_list:
        keyword = x
        handle = Entrez.esearch(db ='pubmed',
                                retmax=1000,
                                retmode ='text',
                                term = keyword)
        results= Entrez.read(handle)
        print('Total number of publications that contain the term {}: {}'.format(keyword, results['Count']))    
    for keyword, results['Count'] in results:
        results[x].append(results['Count'])
    return results

if __name__ == '__main__':
    results = search(disease_list)
Esearch Entrez.esearch Pubmed Abstracts • 1.1k views

If you're only getting the last entry out of a loop, it's because somewhere your loop is overwriting the entry instead of appending new ones.
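
To illustrate the difference between overwriting and appending in a loop, here is a minimal sketch with no network calls. The names `last_only` and `collected` are made up for this example and do not come from the original code.

```python
terms = ["ebola", "aml"]

# Overwriting: the name is rebound on every pass,
# so only the value from the final iteration survives.
last_only = {}
for term in terms:
    last_only = {"term": term}

# Accumulating: each pass adds a new key to the same dictionary,
# so every term keeps its own entry.
collected = {}
for term in terms:
    collected[term] = {"term": term}

print(last_only)
print(collected)
```

The first loop ends with a single entry, the second with one entry per term.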

I've tried to fix the formatting of your code, but at the moment your indentation and loop structure are not clear, so it's not obvious which parts you mean to have in which loop, and therefore where your problem comes from. Please make sure the code appears as you intended.


Sorry about that, let's try again. I turned it into a function so it's easier to read. This is the part I'm getting stuck on:

from Bio import Entrez
from Bio import Medline
from tqdm import tqdm
import pandas as pd
pd.set_option('display.max_colwidth', None)  # None means no truncation; -1 is deprecated in recent pandas
import numpy as np
# Change this email to your email address
Entrez.email = "put your email address here"

#############################################################################
disease_list = ['ebola', 'aml', 'primary glomerular disease associated with significant proteinuria']

#search and return total number of publications 
def search(x):
    Entrez.email=Entrez.email
    results = {}
    for x in disease_list:
        keyword = x
        handle = Entrez.esearch(db ='pubmed',
                                retmax=1000,
                                retmode ='text',
                                term = keyword)
        results= Entrez.read(handle)
        print('Total number of publications that contain the term {}: {}'.format(keyword, results['Count']))    
    for keyword, results['Count'] in results:
        results[x].append(results['Count'])
    return results

if __name__ == '__main__':
    results = search(disease_list)
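
In the function above, `results = Entrez.read(handle)` rebinds `results` on every pass, so the dictionary created before the loop is replaced by the record for the latest term. The second loop then iterates over the keys of that last record rather than over the search terms. One possible fix, offered as a sketch rather than a definitive implementation, is to store each record under its own term. The helper `fetch_pubmed_record` and the `fetch` parameter are hypothetical names introduced here so the loop logic can be tried without network access. The `retmode` argument is left out on the assumption that `Entrez.read` parses the default XML response.

```python
def fetch_pubmed_record(term, retmax=1000):
    """Run one PubMed esearch and return the parsed record.

    Hypothetical helper: needs Biopython and network access.
    """
    from Bio import Entrez  # imported here so the rest of the sketch runs without Biopython

    Entrez.email = "put your email address here"
    handle = Entrez.esearch(db="pubmed", retmax=retmax, term=term)
    try:
        return Entrez.read(handle)
    finally:
        handle.close()


def search(terms, fetch=fetch_pubmed_record):
    """Return a dict mapping each search term to its own esearch record."""
    results = {}
    for term in terms:
        # Use a separate name for the per-term record so `results` is never rebound.
        record = fetch(term)
        results[term] = record
        print('Total number of publications that contain the term {}: {}'.format(
            term, record['Count']))
    return results
```

Calling `search(disease_list)` should then give one entry per term, so that for example `results['ebola']['Count']` and `results['ebola']['IdList']` hold the count and the PubMed IDs for that term only.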