6.5 years ago
Arup Ghosh
I'm trying to convert some BioSample IDs to SRA IDs using the following Python script, but the response time is very high. Is there a faster way to do this?
#!/usr/bin/python3
import sys
import urllib.request

from bs4 import BeautifulSoup

# Example URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=biosample&cmd=neighbor_score&linkname=bioproject_sra_all&db=sra&id=235777
idconv = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=biosample&cmd=neighbor_score&linkname=bioproject_sra_all&db=sra&id="
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'}

# One UID per line in the input file; one ELink request per UID.
with open(sys.argv[1], "r") as uids:
    for uid in uids:
        uid = uid.strip()  # drop the trailing newline so it does not end up in the URL
        if not uid:
            continue  # skip blank lines
        url = idconv + uid
        req = urllib.request.Request(url, data=None, headers=headers)
        print(uid)
        page = urllib.request.urlopen(req)
        soup = BeautifulSoup(page, "lxml")
        print(soup.prettify())
Not a programmatic approach, but you can use Batch Entrez (https://www.ncbi.nlm.nih.gov/sites/batchentrez): Batch Entrez -> select the BioSample database -> upload the BioSample ID list -> Retrieve Records -> select Summary (Text) -> download the file -> grep "^Identifiers"
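If you would rather do the last grep step in Python, something like the sketch below might work. It assumes each record in the downloaded summary file has a line of the form "Identifiers: BioSample: ...; SRA: ..." with "key: value" pairs separated by semicolons; check that against your own file. The function name parse_identifiers and the sample text are made up for illustration.

```python
def parse_identifiers(summary_text):
    """Return one dict per 'Identifiers' line, mapping each label to its value."""
    records = []
    for line in summary_text.splitlines():
        if not line.startswith("Identifiers"):
            continue  # same filter as grep "^Identifiers"
        # Drop the leading "Identifiers:" label, keep the rest of the line.
        _, _, rest = line.partition(":")
        fields = {}
        # Assumption: pairs look like "BioSample: SAMN...; SRA: SRS...".
        for pair in rest.split(";"):
            key, sep, value = pair.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        records.append(fields)
    return records


# Made-up sample text in the assumed layout, not real NCBI output.
sample = """1: Example sample
Identifiers: BioSample: SAMN00000001; Sample name: demo; SRA: SRS0000001
Organism: Homo sapiens
"""

for rec in parse_identifiers(sample):
    # .get() returns None when a record has no SRA entry.
    print(rec.get("BioSample"), rec.get("SRA"))
```

For a real file, read it with open(...).read() and pass the text to parse_identifiers. Records without an SRA identifier simply come back without that key.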
Batch Entrez is not accepting more than 20 entries at a time.
Sorry, my bad. You need to choose the Send to -> File option and then select Summary (Text). The web page displays only 20 entries by default.
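If you want to stay with the script from the question, one thing that may help with the response time is sending several UIDs per ELink request instead of one request per UID. ELink accepts a repeated id parameter and, in that form, reports the links for each input UID separately. The sketch below only builds the URLs and makes no network calls. The query parameters are copied from the question as-is; the helper names, the batch size of 100, and the extra UIDs in the example are assumptions for illustration.

```python
from urllib.parse import urlencode

ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def batched(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_elink_urls(uids, batch_size=100):
    """Build one ELink URL per batch of UIDs (batch_size is an arbitrary choice)."""
    urls = []
    for chunk in batched(uids, batch_size):
        # Parameters taken unchanged from the script in the question.
        params = [
            ("dbfrom", "biosample"),
            ("cmd", "neighbor_score"),
            ("linkname", "bioproject_sra_all"),
            ("db", "sra"),
        ]
        # Repeating "id" (rather than comma-joining) keeps results per input UID.
        params += [("id", uid) for uid in chunk]
        urls.append(ELINK + "?" + urlencode(params))
    return urls


# 235777 is the UID from the question; the other two are made up.
urls = build_elink_urls(["235777", "235778", "235779"], batch_size=2)
for url in urls:
    print(url)
```

Each URL can then be fetched the same way as in the original script. NCBI asks for no more than 3 requests per second without an API key, so a short time.sleep() between requests is probably still worth having.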