Hello everyone
I have the following code that prints an accession number to a file based on the Gene ID the user entered:
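use strict;
use warnings;
use LWP::Simple;
# assemble the URL
my $base = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/';
my $url = $base . "efetch.fcgi?db=nucleotide&id=$ARGV[0]&rettype=acc";
# fetch the URL (get() returns undef on failure)
my $output = get($url);
die "Request failed: $url\n" unless defined $output;
my $filename = 'acc_num.txt';
open(my $fh, '>', $filename) or die "Cannot open $filename: $!";
print $fh $output;
close($fh);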
The accession number of the GID 3773153 written to the output file is: AI211211.1
I want the printed accession number to be: NC_007596.2
What am I supposed to change in my code?
Thanks in advance!!
I am not sure why you think NC_007596.2 should be returned instead of AI211211.1, as the two do not seem to be linked on the NCBI web pages.
Wanting something is fine, but is the example given real? Because AI211211 does not seem to have any correlation to NC_007596.2.
Like I've said, the accession number AI211211.1 is the output of the program. This web page proves the correlation between the GID and the desired accession number: https://www.ncbi.nlm.nih.gov/gene/?term=3773153
You should stop using gi IDs. While NCBI uses them internally, they stopped supporting their use externally two years back. You should always use actual accession numbers.
Look, I've got the code I posted from the NCBI API page. It doesn't matter if it's still relevant or not, I just need to modify the code to print the desired accession number. Can you help me with that?
I guess you are searching a different database in your script. I get the following output using the NCBI command-line eutils:
The following error is printed to the output file:
<entrezgene-set>
Cytb [Mammuthus primigenius (woolly mammoth)]
Cytb
<dl class="details"/><dl class="details"><dt class="desig">Other Designations: </dt><dd class="desig">Cytb; cytochrome b</dd></dl><dl class="details"><dt class="desig">Mitochondrion: </dt><dd class="desig">MT</dd></dl><dl class="details"><dt class="desig"> Annotation: </dt><dd class="desig">Chromosome MT, NC_007596.2 (14151..15286)</dd></dl>
If I'm using Perl, how am I supposed to access the gene database?
You'd need to change the db parameter to db=gene in the line that assembles the URL.
How can I extract just the accession number from the output file using Perl?
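One way to pull out just the accession is a regular expression. The sketch below is only an illustration: the helper name extract_accession is made up, and the pattern assumes that the accession you want looks like a RefSeq ID (two capital letters, an underscore, digits, a dot and a version number), as in the annotation line quoted above. The exact layout of what efetch returns for db=gene is not shown in this thread, so check the pattern against your real output file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first RefSeq-style accession
# (e.g. NC_007596.2) found in $text, or undef if none is present.
sub extract_accession {
    my ($text) = @_;
    return $text =~ /\b([A-Z]{2}_\d+\.\d+)\b/ ? $1 : undef;
}

my $text;
if (@ARGV) {
    # Read the whole output file (e.g. acc_num.txt) into one string.
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    local $/;
    $text = <$fh>;
    close($fh);
} else {
    # Sample taken from the annotation line quoted in this thread.
    $text = 'Annotation: Chromosome MT, NC_007596.2 (14151..15286)';
}

my $acc = extract_accession($text);
print defined $acc ? "$acc\n" : "no accession found\n";
```

Run without arguments it uses the sample line; run with a file name (for example acc_num.txt) it searches that file instead. Note that the pattern deliberately does not match GenBank-style accessions such as AI211211.1, which have no underscore.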