Entrez-efetch error. What could be wrong with my efetch usage?
5.6 years ago
jaqx008 ▴ 110

Hey everyone, I am trying to download an SRA file from the Unix command line, and my efetch command is below. Many people have encountered this error and found solutions in the recommendations, but I haven't. I think something is wrong with my efetch usage.

efetch -db sra -format fastq -id SAMD00028077 > Blastula.fastq

I get the following error, and all the info for that sample is written to the FASTQ file:

Error

400 Bad Request No do_post output returned from 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=sra&id=SAMD00028077&rettype=fastq&retmode=text&edirect_os=linux&edirect=11.2&tool=edirect&email=edthompson@magnolia01' Result of do_post http request is $VAR1 = bless( {
                 '_rc' => 400,
                 '_protocol' => 'HTTP/1.1',
                 '_content' => 'ID list is empty! In it there are neither IDs nor accessions. ',
                 '_msg' => 'Bad Request',
                 '_request' => bless( {
                                        '_headers' => bless( {
                                                               '::std_case' => {
                                                                                 'if-ssl-cert-subject' => 'If-SSL-Cert-Subject'
                                                                               },
                                                               'user-agent' => 'libwww-perl/6.36',
                                                               'content-type' => 'application/x-www-form-urlencoded'
                                                             }, 'HTTP::Headers' ),
                                        '_method' => 'POST',
                                        '_uri_canonical' => bless( do{\(my $o = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi')}, 'URI::https' ),
                                        '_content' => 'db=sra&id=SAMD00028077&rettype=fastq&retmode=text&edirect_os=linux&edirect=11.2&tool=edirect&email=edthompson@magnolia01',
                                        '_uri' => $VAR1->{'_request'}{'_uri_canonical'}
                                      }, 'HTTP::Request' ),
                 '_headers' => bless( {
                                        'x-xss-protection' => '1; mode=block',
                                        'client-ssl-cert-subject' => '/C=US/ST=Maryland/L=Bethesda/O=National Library of Medicine/OU=National Center for Biotechnology Information/CN=*.ncbi.nlm.nih.gov',
                                        'server' => 'Finatra',
                                        'strict-transport-security' => 'max-age=31536000; includeSubDomains; preload',
                                        'ncbi-phid' => '322CB55B42D4D2E500002805E9D8639D.1.1.m_2',
                                        'content-type' => 'text/plain; charset=UTF-8',
                                        'client-date' => 'Wed, 24 Apr 2019 15:57:41 GMT',
                                        '::std_case' => {
                                                          'client-response-num' => 'Client-Response-Num',
                                                          'x-ua-compatible' => 'X-UA-Compatible',
                                                          'access-control-allow-origin' => 'Access-Control-Allow-Origin',
                                                          'ncbi-sid' => 'NCBI-SID',
                                                          'client-ssl-cipher' => 'Client-SSL-Cipher',
                                                          'x-ratelimit-remaining' => 'X-RateLimit-Remaining',
                                                          'client-ssl-socket-class' => 'Client-SSL-Socket-Class',
                                                          'content-security-policy' => 'Content-Security-Policy',
                                                          'l5d-success-class' => 'L5d-Success-Class',
                                                          'x-ratelimit-limit' => 'X-RateLimit-Limit',
                                                          'client-peer' => 'Client-Peer',
                                                          'client-transfer-encoding' => 'Client-Transfer-Encoding',
                                                          'set-cookie' => 'Set-Cookie',
                                                          'client-ssl-cert-issuer' => 'Client-SSL-Cert-Issuer',
                                                          'ncbi-phid' => 'NCBI-PHID',
                                                          'strict-transport-security' => 'Strict-Transport-Security',
                                                          'client-date' => 'Client-Date',
                                                          'x-xss-protection' => 'X-XSS-Protection',
                                                          'client-ssl-cert-subject' => 'Client-SSL-Cert-Subject'
                                                        },
                                        'client-ssl-cert-issuer' => '/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert SHA2 High Assurance Server CA',
                                        'vary' => 'Accept-Encoding',
                                        'set-cookie' => 'ncbi_sid=0BF58CE0F7076DE0_99BASID; domain=.nih.gov; path=/; expires=Fri, 24 Apr 2020 15:57:40 GMT',
                                        'client-transfer-encoding' => [
                                                                        'chunked'
                                                                      ],
                                        'client-peer' => '130.14.29.110:443',
                                        'cache-control' => 'private',
                                        'date' => 'Wed, 24 Apr 2019 15:57:40 GMT',
                                        'x-ratelimit-limit' => '3',
                                        'l5d-success-class' => '1.0',
                                        'content-security-policy' => 'upgrade-insecure-requests',
                                        'client-ssl-socket-class' => 'IO::Socket::SSL',
                                        'x-ratelimit-remaining' => '2',
                                        'client-ssl-cipher' => 'ECDHE-RSA-AES256-GCM-SHA384',
                                        'connection' => 'close',
                                        'ncbi-sid' => '0BF58CE0F7076DE0_99BASID',
                                        'access-control-allow-origin' => '*',
                                        'client-response-num' => 1,
                                        'x-ua-compatible' => 'IE=Edge'
                                      }, 'HTTP::Headers' )
               }, 'HTTP::Response' );
efetch RNA-Seq Entrez-direct • 3.5k views
5.6 years ago
GenoMax 147k

I don't believe you can use Entrez Direct to fetch sequences from SRA. If you look at the help for efetch, you will see that the only formats supported for SRA are the following:

 sra
                 native             xml      EXPERIMENT_PACKAGE_SET XML
                 runinfo            xml      SraRunInfo XML

You can do stuff like this:

$ esearch -db sra -query SAMD00028077 | efetch -format runinfo -mode xml

<SraRunInfo>
<Row>
<Run>DRR032680</Run>
<ReleaseDate>2017-09-20 06:08:37</ReleaseDate>
<LoadDate>2017-09-20 07:05:56</LoadDate>
<spots>57063175</spots>
<bases>5763380675</bases>
<spots_with_mates>0</spots_with_mates>
<avgLength>101</avgLength>
<size_MB>3437</size_MB>
<download_path>https://sra-download.ncbi.nlm.nih.gov/traces/dra4/DRR/000031/DRR032680</download_path>
<Experiment>DRX029486</Experiment>
<LibraryName>Bf_blastula_2</LibraryName>
<LibraryStrategy>RNA-Seq</LibraryStrategy>
<LibrarySelection>other</LibrarySelection>
<LibrarySource>TRANSCRIPTOMIC</LibrarySource>
<LibraryLayout>SINGLE</LibraryLayout>
<InsertSize>0</InsertSize>
<InsertDev>0</InsertDev>
<Platform>ILLUMINA</Platform>
<Model>Illumina HiSeq 2000</Model>
<SRAStudy>DRP003810</SRAStudy>
<BioProject>PRJDB3785</BioProject>
<ProjectID>407062</ProjectID>
<Sample>DRS049885</Sample>
<BioSample>SAMD00028077</BioSample>
<SampleType>simple</SampleType>
<TaxID>7739</TaxID>
<ScientificName>Branchiostoma floridae</ScientificName>
<SampleName>SAMD00028077</SampleName>
<Sex>male, female, and mixed</Sex>
<Tumor>no</Tumor>
<CenterName>UT-BS</CenterName>
<Submission>DRA003460</Submission>
<Consent>public</Consent>
<RunHash>00028A5454C28D3B103C3772C0B485A7</RunHash>
<ReadHash>9A87D0D38C636BC9F5D2E527B7FCB677</ReadHash>
</Row>

</SraRunInfo>

I suggest you use Phil Ewels' SRA-Explorer to get the command lines you need to download data from ENA/SRA.

Edit: You could do the following:

$ esearch -db sra -query SAMD00028077 | efetch -format runinfo -mode xml | xtract -pattern SraRunInfo -element download_path
https://sra-download.ncbi.nlm.nih.gov/traces/dra4/DRR/000031/DRR032680

Then use wget or curl to save the .sra file. You can then run fastq-dump on that file to extract the sequence data.
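As a rough sketch of that last step, the helper below downloads one .sra file and converts it with fastq-dump. The function names `run` and `sra_to_fastq` and the `fastq` output directory are made up for illustration, and it assumes the download_path URL is still being served and that wget and the SRA Toolkit are installed. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the command when DRY_RUN=1 (default), else run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Hypothetical helper: fetch one .sra file and dump it as gzipped FASTQ.
sra_to_fastq() {
    local url="$1"
    local run_acc="${url##*/}"   # last path component, e.g. DRR032680
    run wget -O "${run_acc}.sra" "$url"
    # This run is single-end; paired-end runs would need --split-files as well.
    run fastq-dump --gzip --outdir fastq "${run_acc}.sra"
}
```

To actually download, call it with the dry run switched off, e.g. `DRY_RUN=0 sra_to_fastq https://sra-download.ncbi.nlm.nih.gov/traces/dra4/DRR/000031/DRR032680`.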


I have spent the last 3 hours trying to install fastq-dump (the SRA Toolkit), but it has not been successful. I use a Mac, but I am trying to work on magnolia (cloud), which appears to be a Linux system, and I have not been able to get fastq-dump on it to perform the last operation. Help please.


Whenever possible, get the fastq files directly from EBI-ENA. The SRA-Explorer tool will give you FTP URLs from EBI-ENA.

For example: ftp://ftp.sra.ebi.ac.uk/vol1/fastq/DRR032/DRR032680/DRR032680.fastq.gz

You could construct them yourself by comparing this URL with the download location above.
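A minimal sketch of building that URL from a run accession is shown here. The function name `ena_fastq_url` is invented for this example. The 9-character case follows the URL above; the extra zero-padded subdirectory for longer accessions is an assumption based on ENA's published directory layout, and paired-end runs (which use `_1.fastq.gz` and `_2.fastq.gz` file names) are not handled.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build the ENA FTP URL for a single-end run accession.
ena_fastq_url() {
    local acc="$1"
    local base="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${acc:0:6}"   # e.g. .../DRR032
    if [ "${#acc}" -gt 9 ]; then
        # Assumption: accessions longer than 9 characters get an extra directory
        # made from the characters after the 9th, zero-padded to 3 digits.
        base="${base}/$(printf '%03d' "$((10#${acc:9}))")"
    fi
    echo "${base}/${acc}/${acc}.fastq.gz"
}

ena_fastq_url DRR032680
# prints ftp://ftp.sra.ebi.ac.uk/vol1/fastq/DRR032/DRR032680/DRR032680.fastq.gz
```

It is worth checking the resulting URL against what SRA-Explorer reports for the same run before relying on it.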


I can download it, but I have many SRA files to download, which is why I want to do it on the cloud (my school's server). I was hoping I could download them directly into my folder on the server and then work right on the cloud, to save storage and get better speed.


And you can. Just use the FTP URLs from your cloud account so the data gets downloaded directly there. You can search and build a full list with SRA-Explorer in creative ways.
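One possible way to fetch a whole list on the server is sketched below. It assumes a hypothetical text file `urls.txt` with one FTP URL per line (for example, pasted from SRA-Explorer) and that wget is available there. The function name `download_list` and the default `fastq` directory are illustrative, and the default is a dry run that only prints the wget command.

```shell
#!/usr/bin/env bash

# Hypothetical helper: download every URL in a list file into one directory.
# DRY_RUN=1 (the default) prints the wget command instead of running it.
download_list() {
    local list="$1"
    local outdir="${2:-fastq}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo wget -c -P "$outdir" -i "$list"
    else
        mkdir -p "$outdir"
        # -c resumes partial files, -P sets the target directory, -i reads URLs from the file
        wget -c -P "$outdir" -i "$list"
    fi
}
```

Running `DRY_RUN=0 download_list urls.txt fastq` from a shell on the server writes the files straight into `fastq/` there, so nothing passes through the local machine.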


OK, thanks a lot. I will do this. I appreciate your time and patience.
