4.0 years ago
sepidehafshar303
Hi, I have a file of gene names and I want to find their Ensembl IDs using Python, but I get an error. How can I fix it?
nDis = 100000
with open('geo.csv', 'rt', encoding='utf-8') as csvfile:
    mg = mygene.MyGeneInfo()
    csvreader = csv.reader(csvfile, delimiter=',')
    raw_file = []
    n = 0
    counter = 0
    for row in csvreader:
        counter = counter + 1
        if counter is not 1 and counter < nDis:
            #print('Preprocessing gene data: ' + str(counter))
            d = {}
            d['geneName'] = row[9]
            raw_file.append(d)
            #print(d)
    for gene in raw_file:
        result = mg.query(gene, scopes="symbol", fields=["ensembl"], species="human", verbose=False)
        hgnc_name = gene
        for hit in result["hits"]:
            if "ensembl" in hit and "gene" in hit["ensembl"]:
                sys.stdout.write("%s\t%s\n" % (hgnc_name, hit["ensembl"]["gene"]))
HTTPError: 400 Client Error: Bad Request for url: http://mygene.info/v3/query/
In the future, please properly format your post, in particular the code using the code formatting button (the one with 1s and 0s). This makes the post much easier to understand. I have done it for you this time.
Thank you so much for correcting my post. I didn't know how to change it to code format.
As an easier alternative, you can try the db2db conversion tool at https://biodbnet-abcc.ncifcrf.gov/db/db2db.php. It has many options to convert to and from, including the two you need, Ensembl gene IDs and gene names.
Thank you so much, it was so easy to use and I got my answer.
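For anyone who would rather stay in Python: I have not reproduced the exact 400 error, but two things in the posted code look suspect. Each `gene` passed to `mg.query` is a dict (`{'geneName': ...}`) rather than a symbol string, and `scopes` is a parameter of the batch method `querymany`, not of a single `query`. Below is a minimal sketch under those assumptions. The helper names (`read_gene_symbols`, `ensembl_gene_ids`, `print_ensembl_ids`) are made up for illustration, and the column index 9 and the header row are taken from the question. It also handles the case where the `ensembl` field comes back as a list of dicts instead of a single dict.

```python
import csv
import io


def read_gene_symbols(csvfile, column=9, max_rows=100000):
    """Collect gene symbols as plain strings from one CSV column."""
    reader = csv.reader(csvfile, delimiter=",")
    next(reader, None)  # row 1 is assumed to be a header, as in the question
    symbols = []
    for line_no, row in enumerate(reader, start=2):
        if line_no >= max_rows:  # same cutoff as `counter < nDis`
            break
        if len(row) > column and row[column].strip():
            symbols.append(row[column].strip())
    return symbols


def ensembl_gene_ids(hit):
    """Return the Ensembl gene IDs in one hit ("ensembl" may be a dict or a list)."""
    ensembl = hit.get("ensembl", [])
    if isinstance(ensembl, dict):
        ensembl = [ensembl]
    return [entry["gene"] for entry in ensembl if "gene" in entry]


def print_ensembl_ids(path="geo.csv"):
    """Look up all symbols in one batch request and print symbol/ID pairs."""
    import mygene  # third-party; imported here so the helpers above run without it

    with open(path, "rt", encoding="utf-8") as csvfile:
        symbols = read_gene_symbols(csvfile)

    mg = mygene.MyGeneInfo()
    hits = mg.querymany(symbols, scopes="symbol", fields="ensembl.gene",
                        species="human", verbose=False)
    for hit in hits:
        for gene_id in ensembl_gene_ids(hit):
            print(f"{hit['query']}\t{gene_id}")


# Offline check of the CSV helper with made-up rows (no network needed).
sample = io.StringIO(
    "a,b,c,d,e,f,g,h,i,gene\n"
    "0,0,0,0,0,0,0,0,0,TP53\n"
    "0,0,0,0,0,0,0,0,0,BRCA1\n"
)
print(read_gene_symbols(sample))  # ['TP53', 'BRCA1']
```

Calling `print_ensembl_ids("geo.csv")` then does the actual lookup. Sending all symbols in a single `querymany` call should also be much faster than one request per gene.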