NCBI's Efetch not working
3.0 years ago
Jonathan ▴ 20

Any help would be much appreciated. My goal is to run the following for loop to generate a list of sample_id values (which are actually isolation sites) for a list of SRA accessions. However, I get an error (see below) for every accession.

 # Column 1 of metadata.txt holds the accessions; NR>1 skips the header row
 for sra in $(awk 'NR>1{print $1}' metadata.txt)
 do
     esearch -db sra -query "$sra" | efetch -format text | grep "sample_id" | sed 's/^.*G_DNA_//' | sed 's/"//' >> isolation_sites.txt
 done

I get the same error when I run the pipeline as a single command on a single sample, without the for loop:

 esearch -db sra -query SRS056042 | efetch -format txt

The error is as follows:

curl: (23) Failed writing body

ERROR: curl command failed ( Tue Dec 7 14:17:34 EST 2021 ) with: 23

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=sra&term=Unable%20%5BACCN%5D%20OR%20to%20%5BACCN%5D%20OR%20locate%20%5BACCN%5D%20OR%20xtract%20%5BACCN%5D%20OR%20executable.%20%5BACCN%5D%20OR%20Please%20%5BACCN%5D%20OR%20execute%20%5BACCN%5D%20OR%20the%20%5BACCN%5D%20OR%20following%20%5BACCN%5D%20OR%20nquire%20%5BACCN%5D%20OR%20dwn%20%5BACCN%5D%20OR%20ftp.ncbi.nlm.nih.gov%20%5BACCN%5D%20OR%20entrez%20%5BACCN%5D%20OR%20entrezdirect%20%5BACCN%5D%20OR%20xtract.Darwin.gz%20%5BACCN%5D%20OR%20gunzip%20%5BACCN%5D%20OR%20f%20%5BACCN%5D%20OR%20xtract.Darwin.gz%20%5BACCN%5D%20OR%20chmod%20%5BACCN%5D%20OR%20x%20%5BACCN%5D%20OR%20xtract.Darwin%20%5BACCN%5D&retmax=10000

HTTP/1.1 200 OK

curl: (22) The requested URL returned error: 400

ERROR: curl command failed ( Tue Dec 7 14:17:35 EST 2021 ) with: 22

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi -d db=sra&id=x%2Cdwn%2Cf%2Cchmod%2Centrez%2Centrezdirect%2Cftp.ncbi.nlm.nih.gov%2Cgunzip%2Cnquire%2Cxtract.Darwin%2Cxtract.Darwin.gz%2CPlease%2CUnable%2Cexecutable.%2Cexecute%2Cfollowing%2Clocate%2Cthe%2Cto%2Cxtract&rettype=txt&retmode=text&tool=edirect&edirect=16.2&edirect_os=Darwin&email=home%40JJGs-MacBook-Pro.local

HTTP/1.1 400 Bad Request

WARNING: FAILURE ( Tue Dec 7 14:17:34 EST 2021 )

nquire -url https://eutils.ncbi.nlm.nih.gov/entrez/eutils/ efetch.fcgi -db sra -id x,dwn,f,chmod,entrez,entrezdirect,ftp.ncbi.nlm.nih.gov,gunzip,nquire,xtract.Darwin,xtract.Darwin.gz,Please,Unable,executable.,execute,following,locate,the,to,xtract -rettype txt -retmode text -tool edirect -edirect 16.2 -edirect_os Darwin -email home@JJGs-MacBook-Pro.local

EMPTY RESULT

edirect efetch • 2.7k views

The command below

esearch -db sra -query SRS056042 | efetch -format txt

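works for me on macOS.

The error message points to an installation problem: the terms sent to NCBI are not accessions but the words of a local message about being unable to locate the xtract executable: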

Unable,executable.,execute,following,locate,the,to,xtrac
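One way to make this visible is to decode the URL-encoded term from the failing esearch request. The following is a minimal sketch: `decode_term` is a hypothetical helper, not part of EDirect, and the input is a shortened copy of the term shown in the question.

```shell
# Turn a URL-encoded esearch term back into readable words.
# %20 is a space; %5B and %5D are the square brackets around ACCN.
# decode_term is a hypothetical helper name, not an EDirect command.
decode_term() {
    sed -e 's/%20/ /g' \
        -e 's/%5B/[/g' \
        -e 's/%5D/]/g' \
        -e 's/ \[ACCN\] OR / /g' \
        -e 's/ \[ACCN\]$//'
}

# Shortened copy of the term from the failing request in the question.
echo 'Unable%20%5BACCN%5D%20OR%20to%20%5BACCN%5D%20OR%20locate%20%5BACCN%5D%20OR%20xtract%20%5BACCN%5D%20OR%20executable.%20%5BACCN%5D' | decode_term
```

The decoded text reads as the start of a message rather than a list of accessions, which suggests that a message printed by a broken install was passed along as the query.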

3.0 years ago
GenoMax 147k

I don't see that issue on macOS, so you must have a problem with your EntrezDirect install, as noted by @Istvan. The best option is to install it with conda.

$  esearch -db sra -query SRS056042 | efetch -format txt | grep "sample_id" | sed 's/^.*G_DNA_//' | sed 's/"//'
Buccal mucosa of a female participant in the dbGaP study "HMP Core Microbiome Sampling Protocol A (HMP-A)" spots="13744700" bases="2748940000" tax_id="646099" organism="human metagenome"><IDENTIFIERS><PRIMARY_ID>SRS056042</PRIMARY_ID><EXTERNAL_ID namespace="BioSample">SAMN00076051</EXTERNAL_ID><EXTERNAL_ID namespace="dbGaP" label="Sample name">228-700106303</EXTERNAL_ID><EXTERNAL_ID namespace="phs000228" label="submitted sample id">700106303</EXTERNAL_ID></IDENTIFIERS></Member></Pool><CloudFiles><CloudFile filetype="run" provider="gs" location="gs.US"/><CloudFile filetype="run" provider="s3" location="s3.us-east-1"/></CloudFiles><Statistics nreads="2" nspots="13744700"><Read index="0" count="13744700" average="100" stdev="0"/><Read index="1" count="13744700" average="100" stdev="0"/></Statistics><Bases cs_native="false" count="2748940000"><Base value="A" count="32241253"/><Base value="C" count="25742833"/><Base value="G" count="25774267"/><Base value="T" count="32172905"/><Base value="N" count="2633008742"/></Bases></RUN></RUN_SET><ControlledAccess study="phs000228" consent="HMP"/></EXPERIMENT_PACKAGE>

Thank you! It looks like it was an installation error, as you suggested. The original install command came from the NCBI website (https://www.ncbi.nlm.nih.gov/books/NBK179288/):

 sh -c "$(curl -fsSL ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/install-edirect.sh)"

I just reinstalled via conda, and everything now works as expected:

conda install -c bioconda entrez-direct
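
For anyone hitting the same error, a quick check of whether the shell can find the EDirect programs may help before and after reinstalling. This is only a sketch: `check_tools` is a hypothetical helper, and the program names are the ones that appear in the error output above. It checks PATH lookup only, so a wrapper script could still be found while a platform-specific binary (such as the xtract.Darwin named in the log) is missing.

```shell
# Print whether each named program can be found on PATH.
# Returns non-zero if any of them is missing.
# check_tools is a hypothetical helper name, not an EDirect command.
check_tools() {
    rc=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

# Program names taken from the error output in the question.
check_tools esearch efetch nquire xtract || echo "some EDirect programs are not on PATH"
```

Extra names can be passed as arguments, for example `check_tools xtract.Darwin` on macOS, if the install is expected to provide them.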