Getting GI number from NCBI database through code
6.6 years ago
erans995 • 0

Hello. As part of a college project, I have to write a program that finds FASTA sequences similar to one the user chooses. For example, if the user enters "cat", the program has to display all the relevant entries in the database and let the user choose one. I already have a script that outputs the FASTA data for an entry in the NCBI database, given its accession number.

I have found the following Perl script that converts GI numbers to accession numbers:

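use strict;
use warnings;
use LWP::Simple;

# Comma-separated list of GI numbers to convert
my $gi_list = '24475906,224465210,50978625,9507198';

# Assemble the efetch URL; rettype=acc asks for accession numbers
my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/';
my $url  = $base . "efetch.fcgi?db=nucleotide&id=$gi_list&rettype=acc";

# Send the GET request and print the response
my $output = get($url);
die "Request to $url failed\n" unless defined $output;
print $output;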

However, I haven't found a way to retrieve the GI from the database through code. Thank you for taking the time to read this, I hope you will be able to help me!

ncbi fasta gi code accession number • 2.7k views

I normally don't post replies to homework or project-based questions, but I'll simply point to this post (and indicate you should point this out to your course instructor, it's been two years since the original announcement):

https://www.ncbi.nlm.nih.gov/books/NBK431010/#news_03-02-2016-phase-out-of-GI-numbers


Okay thanks for the update. Let me rephrase my question: how can I retrieve the accession number of a certain entry through code?


See my answer below.

6.6 years ago
GenoMax 147k

NCBI deprecated use of GI numbers in 2016. You should switch your code to using Accession numbers.

NCBI's Entrez Direct (EDirect) Unix utilities let you query by GI and retrieve accession numbers.

$ esearch -db nuccore -query "24475906" | efetch -format acc
NM_009417.2

But that's the point, how can I retrieve the GI through code? I don't know it...


I thought you already had GI numbers. Using your "cat" example, you can get accession numbers like this:

$ esearch -db nuccore -query "cat" | efetch -format acc

AFHV02000288.1
AFHV02000289.1
AFHV02000291.1
AFHV02000292.1
AFHV02000293.1
AFHV02000294.1

I will leave it to you to figure out how to change the query and how to use this method to do URL based searches.
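As a starting point for the URL-based route, here is a minimal sketch of the same two steps done against the E-utilities endpoints with curl. It assumes the usual pattern of calling esearch.fcgi with usehistory=y and then passing the returned WebEnv and QueryKey to efetch.fcgi with rettype=acc. The helper names (build_esearch_url, xml_field, build_efetch_url, search_accessions) are made up for illustration, and the sed-based XML parsing is a shortcut rather than a proper parser.

```shell
#!/usr/bin/env bash
# Sketch of a URL-based version of:
#   esearch -db nuccore -query "cat" | efetch -format acc
# All function names here are hypothetical helpers, not part of any NCBI tool.

EUTILS_BASE='https://eutils.ncbi.nlm.nih.gov/entrez/eutils'

# Build the esearch URL; usehistory=y stores the hits on NCBI's history server.
build_esearch_url() {
  local db="$1" term
  # Assumption: only spaces are encoded; other special characters are left as-is.
  term="$(printf '%s' "$2" | tr ' ' '+')"
  printf '%s/esearch.fcgi?db=%s&term=%s&usehistory=y\n' "$EUTILS_BASE" "$db" "$term"
}

# Print the text of the first XML element named $1 found on stdin.
xml_field() {
  sed -n "s:.*<$1>\([^<]*\)</$1>.*:\1:p" | head -n 1
}

# Build the efetch URL that reads the stored hits back as accession numbers.
build_efetch_url() {
  local db="$1" webenv="$2" query_key="$3" retmax="$4"
  printf '%s/efetch.fcgi?db=%s&query_key=%s&WebEnv=%s&rettype=acc&retmax=%s\n' \
    "$EUTILS_BASE" "$db" "$query_key" "$webenv" "$retmax"
}

# Run both steps (needs curl and network access); prints one accession per line.
search_accessions() {
  local db="$1" term="$2" retmax="${3:-20}" xml webenv query_key
  xml="$(curl -fsS "$(build_esearch_url "$db" "$term")")" || return 1
  webenv="$(printf '%s\n' "$xml" | xml_field WebEnv)"
  query_key="$(printf '%s\n' "$xml" | xml_field QueryKey)"
  # Stop if the search response did not contain the history identifiers.
  [ -n "$webenv" ] && [ -n "$query_key" ] || return 1
  curl -fsS "$(build_efetch_url "$db" "$webenv" "$query_key" "$retmax")"
}
```

With the file sourced, `search_accessions nuccore "cat" 6` would request the first six accessions for that query. The URL builders are kept separate from the network call so the same strings can be pasted into the Perl script's get() call.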


Okay thank you very much, I'll try to figure out the rest by myself
