Hi,
I wasn't able to find anywhere how to pass several seq_start and seq_stop optional arguments, one pair per record, to a list of IDs in a single NCBI efetch query.
See this:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?
db=nuccore&id=433294648&rettype=fasta&seq_start=100&seq_stop=200
Server answer:
>gb|CP003078.1|:100-200 Mycobacterium sp. JS623, complete genome
GGGTCGCAGCCGTATCGCCACGTTCGGGCGACTGTTCGAGGGTACTGACGACATTTCGCTGGGTCAAACC
TCGCCCGAGCGATCCCGGGTCACCGCCCGCA
And now multiple queries:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?
db=nuccore&id=433294648,755160968&rettype=fasta
Server answer: 2 whole FASTA records in one file, in the blink of an eye.
Question:
Does anybody know whether it is possible, and if so, how, to combine the two to obtain one short FASTA record per posted UID, with the range of each record determined by its own seq_start and seq_stop arguments?
So the server answer to something like:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?
db=nuccore&id=433294648,755160968&rettype=fasta&seq_start=100,200&seq_stop=200,500
would be:
>gb|xxxxxxxx.x|:100-200 orgn x
GGGTCGCAGCCGTATCGCCACGTTCGGGCGACTGTTCGAGGGTACTGACGACATTTCGCTGGGTCAAACC
TCGCCCGAGCGATCCCGGGTCACCGCCCGCA
>gb|yyyyyyyy.y|:200-500 orgn y
GGGTCGCAGCCGTATCGCCACGTTCGGGCGACTGTTCGAGGGTACTGACGACATTTCGCTGGGTCAAACC
TCGCCCGAGCGATCCCGGGTCACCGCCCGCA
What I have tried so far: a comma-separated list of seq_start and seq_stop values, putting them into [], adding +AND+, adding a semicolon, anything I could think of.
I know how to solve this with a for loop, but it would help me a lot if I could do it in 'batch' mode.
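For reference, the for-loop fallback mentioned above can be sketched roughly as follows: one efetch URL per record, each carrying its own seq_start and seq_stop. This is only a minimal sketch. The names build_efetch_url and regions are made up for illustration, and the two ranges are the ones from the example above.

```python
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(uid, seq_start, seq_stop, db="nuccore", rettype="fasta"):
    """Build one efetch URL for a single record and a single range.

    Hypothetical helper, not part of any NCBI library.
    """
    params = {
        "db": db,
        "id": uid,
        "rettype": rettype,
        "seq_start": seq_start,
        "seq_stop": seq_stop,
    }
    return EFETCH + "?" + urlencode(params)


# (UID, seq_start, seq_stop) triples taken from the example in the question.
regions = [("433294648", 100, 200), ("755160968", 200, 500)]

# One URL per record; each one still has to be requested separately.
urls = [build_efetch_url(uid, start, stop) for uid, start, stop in regions]
for url in urls:
    print(url)
```

This only builds the URLs; it does not contact the server, and it still means one call per fragment, which is exactly what I would like to avoid.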
Any suggestion would be appreciated. Thanks a lot.
PS: I have already asked this here: C: Fetching Genbank Entries For List Of Accession Numbers, but there it felt a little off topic and the question was not elaborated.
You can use the Unix e-utils and write a bash script that parses the file and takes the seq_start and seq_stop values from each line. Sample command would be
PS: NCBI is phasing out GI numbers so it is recommended to use accession numbers instead.
Hi, thank you for the reply. I know that I can do that in a for loop (and I am currently doing so). But since I want to fetch relatively short fragments, I would like to fetch a reasonable number of them with one command, to limit the calls to the NCBI server. Or am I missing something, and this is what the Unix e-utils inherently do by themselves?
Regarding the PS: yes, I know about that. It currently works with accession numbers, but that is undocumented according to http://www.ncbi.nlm.nih.gov/books/NBK25499/
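Until a true batch form turns up, the loop can at least be kept polite. NCBI's usage guidelines ask for no more than 3 requests per second without an API key, so the sketch below pauses between calls and concatenates the returned FASTA records. It is a rough sketch under the assumption of one request per fragment; fetch_regions, http_get and min_interval are hypothetical names, not part of any NCBI tool.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def http_get(url):
    """Download one URL and return the body as text."""
    with urlopen(url) as response:
        return response.read().decode()


def fetch_regions(regions, fetch=http_get, min_interval=0.34):
    """Fetch one FASTA fragment per (id, seq_start, seq_stop) triple.

    Sleeps min_interval seconds between requests (about 3 per second)
    and returns all records joined into one multi-FASTA string.
    """
    records = []
    for i, (uid, start, stop) in enumerate(regions):
        if i:
            time.sleep(min_interval)  # throttle every call after the first
        query = urlencode({
            "db": "nuccore",
            "id": uid,
            "rettype": "fasta",
            "seq_start": start,
            "seq_stop": stop,
        })
        records.append(fetch(EFETCH + "?" + query))
    return "".join(records)
```

Calling fetch_regions([("433294648", 100, 200), ("755160968", 200, 500)]) would issue two throttled requests. The fetch argument is there only so the download step can be swapped out, for example when testing without network access.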