Hi,
I want to download 16S sequences (where available) for thousands of organisms. I use the following commands:
esearch -db nucleotide -query "Abacoproeces saltuum [ORGN] AND 16S" | efetch -format fasta > Abacoproeces_saltuum.txt
esearch -db nucleotide -query "Abax carinatus [ORGN] AND 16S" | efetch -format fasta > Abax_carinatus.txt
...
and so on for ~40,000 species. The command has the same format for every species (as shown above).
The script starts running and files are generated. Some contain data and some do not (if no sequence is available).
After about 20-30 species have been searched, I receive the error below.
However, the process continues after the error: new files are still generated, including files that contain sequences. The error also reappears multiple times.
Other forum topics about this error do not describe this behavior, so I am opening a new post.
My questions: what causes this problem? Somewhere in the error I can read "Too Many Requests", but since I run the commands as a bash script, each line should be executed separately. I therefore cannot imagine that the ~40,000 entries (and thus ~40,000 lines) are the problem. I also don't think the esearch/efetch (Entrez Direct) installation is at fault, as some searches work perfectly.
Can I expect the generated files to be correct? If a file is empty (0 kB), can I assume that no sequence was really available for that species, or can the file be empty because of the error?
Thanks.
PC: Ubuntu 18.04.3 LTS
'_headers' => bless( {
'content-security-policy' => 'upgrade-insecure-requests',
'user-agent' => 'libwww-perl/6.31',
'retry-after' => '2',
'content-type' => 'application/json',
'server' => 'Finatra',
'x-ratelimit-remaining' => 'X-RateLimit-Remaining',
'client-ssl-socket-class' => 'IO::Socket::SSL',
'client-response-num' => 'Client-Response-Num',
'client-response-num' => 'Client-Response-Num',
'if-ssl-cert-subject' => 'If-SSL-Cert-Subject'
'_headers' => bless( {
'if-ssl-cert-subject' => 'If-SSL-Cert-Subject'
'content-type' => 'application/json',
'client-ssl-socket-class' => 'IO::Socket::SSL',
'client-ssl-cipher' => 'Client-SSL-Cipher',
'content-security-policy' => 'Content-Security-Policy',
'content-type' => 'application/json',
'vary' => 'Accept-Encoding',
'retry-after' => '2',
'content-type' => 'application/json',
'if-ssl-cert-subject' => 'If-SSL-Cert-Subject'
'::std_case' => {
'_uri' => $VAR1->{'_request'}{'_uri_canonical'},
'_headers' => bless( {
'::std_case' => {
'if-ssl-cert-subject' => 'If-SSL-Cert-Subject'
},
'content-type' => 'application/x-www-form-urlencoded',
'user-agent' => 'libwww-perl/6.31'
}, 'HTTP::Headers' )
}, 'HTTP::Request' ),
'_content' => '{"error":"API rate limit exceeded","api-key":"2a02:a212:1880:6a00:b89d:8514:d405:3f62","count":"4","limit":"3"}
',
'_msg' => 'Too Many Requests',
'_rc' => 429,
'_headers' => bless( {
'content-length' => '112',
'server' => 'Finatra',
'client-ssl-cert-subject' => '/C=US/ST=Maryland/L=Bethesda/O=National Library of Medicine/OU=National Center for Biotechnology Information/CN=*.ncbi.nlm.nih.gov',
'client-date' => 'Tue, 27 Oct 2020 12:12:21 GMT',
'client-ssl-socket-class' => 'IO::Socket::SSL',
'content-security-policy' => 'upgrade-insecure-requests',
'date' => 'Tue, 27 Oct 2020 12:12:19 GMT',
'x-ratelimit-limit' => '3',
'x-ua-compatible' => 'IE=Edge',
'::std_case' => {
'x-ua-compatible' => 'X-UA-Compatible',
'x-ratelimit-limit' => 'X-RateLimit-Limit',
'client-ssl-cert-issuer' => 'Client-SSL-Cert-Issuer',
'strict-transport-security' => 'Strict-Transport-Security',
'client-peer' => 'Client-Peer',
'client-response-num' => 'Client-Response-Num',
'client-date' => 'Client-Date',
'client-ssl-socket-class' => 'Client-SSL-Socket-Class',
'content-security-policy' => 'Content-Security-Policy',
'x-ratelimit-remaining' => 'X-RateLimit-Remaining',
'client-ssl-cert-subject' => 'Client-SSL-Cert-Subject',
'client-ssl-cipher' => 'Client-SSL-Cipher',
'x-xss-protection' => 'X-XSS-Protection'
},
'client-peer' => '2607:f220:41e:4290::110:443',
'strict-transport-security' => 'max-age=31536000; includeSubDomains; preload',
'client-response-num' => 1,
'vary' => 'Accept-Encoding',
'x-ratelimit-remaining' => '0',
'x-xss-protection' => '1; mode=block',
'client-ssl-cipher' => 'ECDHE-RSA-AES256-GCM-SHA384',
'connection' => 'close',
'content-type' => 'application/json',
'retry-after' => '2',
'client-ssl-cert-issuer' => '/C=US/O=DigiCert Inc/OU=www.digicert.com/CN=DigiCert SHA2 High Assurance Server CA'
}, 'HTTP::Headers' )
}, 'HTTP::Response' );
WebEnv value not found in search output - WebEnv1
Db value not found in fetch input
Thank you for mentioning the API key. I signed up and created one, and I add the API key parameter like this:
esearch -db nucleotide -query "Abacoproeces saltuum [ORGN] AND 16S" | efetch -api_key="MY_API_KEY" -format fasta > Abacoproeces_saltuum.txt
However, I still get the error, and it states that my limit is 3, while it should be 10. If I enter a random wrong API key, I do get the error "incorrect API key", so the parameter appears to be passed correctly. Any ideas?

Export the key as a variable in your shell:
export NCBI_API_KEY="your_key"

I have now added the parameter -api_key="my api key" to the search command and exported the key as a variable in my shell, and I no longer receive the error. Thank you for your help!
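
For anyone running this over a long species list, here is a rough sketch of how the loop could be written so that rate-limit errors are not mistaken for genuinely empty results. It assumes NCBI_API_KEY is already exported and that anything written to stderr signals a failed request (this is an assumption; EDirect may also print harmless warnings). The names fetch_species, fetch_all, species.txt and failed_species.txt are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: fetch 16S records for one species and keep any error output.
# Assumes esearch/efetch (EDirect) are on PATH and NCBI_API_KEY is exported.
fetch_species() {
    local species name out err
    species=$1
    name=$(printf '%s' "$species" | tr ' ' '_')   # "Abax carinatus" -> "Abax_carinatus"
    out="$name.txt"
    err="$name.err"

    # stdin is redirected so esearch cannot consume the species list
    # when this function is called inside a "while read" loop.
    { esearch -db nucleotide -query "$species [ORGN] AND 16S" < /dev/null |
        efetch -format fasta > "$out"; } 2> "$err"

    # Assumption: a non-empty .err file means the result cannot be trusted.
    if [ -s "$err" ]; then
        printf '%s\n' "$species" >> failed_species.txt
        return 1
    fi
    rm -f "$err"
}

# Read one species name per line and pause between requests.
fetch_all() {
    local list delay species
    list=$1
    delay=${2:-1}   # seconds to wait between species (hypothetical default)
    while IFS= read -r species || [ -n "$species" ]; do
        [ -n "$species" ] || continue   # skip blank lines
        fetch_species "$species" || echo "needs a rerun: $species" >&2
        sleep "$delay"
    done < "$list"
}
```

Calling `fetch_all species.txt 1` would process the list with a one-second pause between species. An empty .txt file with no matching .err file would then suggest that no sequence was found, while the species collected in failed_species.txt can be rerun later.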