Using entrez-direct inside a for loop in bash
5.7 years ago
Lille My ▴ 30

I'm trying to retrieve the genome from which each of a series of proteins is derived. There is more than one assembly for each protein, so I need to create a file that links each protein to its assemblies.

I use the following:

for id in `cat gi-list-file`; do
    elink -target nuccore -db protein -id $id |
    elink -target assembly |
    esummary |
    xtract -pattern AssemblyAccession -element AssemblyAccession
done

The first result I get is the assembly accession, but the second result is the following error message:

Retrying elink, step 2: callMLink: Error reading an UID blob, ,CNCHistory::ReadIdListBuf, result (false) error, blobid=empty

Any ideas on what the problem is?

What happens if you run that accession directly, outside of the loop?

Can you also show us some examples of accessions which work, and some that don't?

Hi! I tried to do something similar, but it doesn't work. I have a list of PubMed IDs and I want to retrieve their abstracts.

for i in `cat only_retrieved_pubmedIDs.csv` ; do echo $i; efetch -input $i -db pubmed -format abstract > $i.txt ; done

One file per PubMed ID is produced, but they are all empty.

While your question is unrelated to the original thread, you should do the following (one PMID per line in the input file):

$ for i in `cat id_file`; do efetch -db  pubmed -id ${i} -format abstract; done
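If you want one file per PMID, as in your original attempt, here is a minimal sketch of the same efetch call wrapped in a function. The function name `fetch_abstracts` and the output directory argument are made up for illustration. It assumes EDirect's `efetch` is on your PATH, strips Windows line endings (a guess at what a `.csv` file might contain), and redirects efetch's standard input from /dev/null as a precaution so it cannot consume the remaining IDs in the loop.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_abstracts is a hypothetical helper, not part of EDirect.
# Usage: fetch_abstracts ID_FILE [OUT_DIR]   (one PMID per line in ID_FILE)
fetch_abstracts() {
    local id_file=$1 out_dir=${2:-.} pmid
    mkdir -p "$out_dir"
    # Drop carriage returns in case the list was saved with Windows line endings.
    tr -d '\r' < "$id_file" | while IFS= read -r pmid || [ -n "$pmid" ]; do
        # Skip blank lines.
        [ -z "$pmid" ] && continue
        # </dev/null keeps efetch from reading the rest of the ID list.
        efetch -db pubmed -id "$pmid" -format abstract </dev/null > "$out_dir/$pmid.txt"
        # Report empty results on stderr instead of failing silently.
        [ -s "$out_dir/$pmid.txt" ] || echo "empty result for PMID $pmid" >&2
    done
}
```

You would call it as `fetch_abstracts only_retrieved_pubmedIDs.csv abstracts`. Any PMID that comes back empty is reported on stderr, so you can see which ones to check by hand.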
5.7 years ago
GenoMax 147k

The following works for me.

$ more names.txt
WP_043107373
WP_000617546.1
WP_000906486.1
WP_001096206.1
WP_001386830.1

Loop used for the lookups:

$ for i in `cat names.txt`; do echo $i; elink -target nuccore -db protein -id $i | elink -target assembly | esummary | xtract -pattern AssemblyAccession -element AssemblyAccession > $i.txt; done

This should produce one file per input accession number.

With a non-existent accession number, an error is generated and the file for that accession is left empty. The loop should continue with the rest of the accessions.

WP_031373
ERROR in link output: BLOB ID IS NOT IMPLEMENTED

The actual error message is longer; it is truncated here for display.
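Since the question asks for a single file linking each protein to its assemblies, here is a hedged sketch that writes one tab-separated table instead of one file per accession. The function name `link_assemblies` and the `NA` placeholder are my own choices for illustration. The elink/esummary/xtract pipeline is the same one used above, and it assumes EDirect is on your PATH. Redirecting the first elink's standard input from /dev/null is a precaution so it cannot read the rest of the ID list inside the `while read` loop.

```shell
#!/usr/bin/env bash
# Sketch only: link_assemblies is a hypothetical helper, not part of EDirect.
# Usage: link_assemblies ID_FILE > protein_to_assembly.tsv
link_assemblies() {
    local id_file=$1 id hits acc
    while IFS= read -r id || [ -n "$id" ]; do
        # Skip blank lines in the accession list.
        [ -z "$id" ] && continue
        # Same lookup chain as in the loop above; </dev/null protects the loop's input.
        hits=$(elink -target nuccore -db protein -id "$id" </dev/null |
               elink -target assembly |
               esummary |
               xtract -pattern AssemblyAccession -element AssemblyAccession)
        if [ -z "$hits" ]; then
            # Keep failed lookups visible in the table and on stderr.
            printf '%s\tNA\n' "$id"
            echo "no assembly found for $id" >&2
        else
            # One output row per protein/assembly pair.
            printf '%s\n' "$hits" | while IFS= read -r acc; do
                printf '%s\t%s\n' "$id" "$acc"
            done
        fi
    done < "$id_file"
}
```

Running `link_assemblies names.txt > protein_to_assembly.tsv` should give one row per protein/assembly pair, with `NA` in the second column for accessions that returned nothing.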
