I have a Python script that runs BioPython's Web Blast function. We're using large fasta files, so the script breaks these files into smaller files and then blasts them. Below is a partial query:
>gi|73485745|gb|AAJJ01000902.1|_30
CCATCTGGCCTGACCCAGATCGGCCTTTTATGGCATACTCGTACCGTAATAAA
The frustrating part is that some of the smaller files work and some don't, even though they appear to be in exactly the same format. On top of that, a file that fails in my script works when submitted through the NCBI BLAST web page. Below is the error message I get when I run it from my script:
ValueError: Error message from NCBI: Cannot accept request, error code: 1
According to NCBI, an error code of 1 means bad query sequences or BLAST options. This is the function:
result_handle = NCBIWWW.qblast("blastn", "nr", fasta_string, megablast=MEGA_BLAST)
where MEGA_BLAST is a boolean. Does anyone have any idea why it would fail? The input string, as far as I can tell, is fine. I have no idea why this is occurring.
UPDATE: This is a file that failed.
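One way to rule out a malformed chunk before it is submitted is to scan each FASTA string for empty records or unexpected characters. The sketch below is only an illustration: `parse_records` and `check_fasta` are hypothetical helper names, not part of Biopython, and nothing in this thread confirms that malformed input is what triggers error code 1.

```python
# IUPAC nucleotide codes; gaps and other symbols are reported as unexpected.
VALID_BASES = set("ACGTUNRYSWKMBDHV")


def parse_records(fasta_string):
    """Split FASTA text into [header, sequence] pairs (hypothetical helper)."""
    records = []
    for line in fasta_string.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line
        else:
            # Sequence data found before any '>' header line.
            records.append(["<no header>", line])
    return records


def check_fasta(fasta_string):
    """Return a list of (header, problem) tuples for suspicious records."""
    problems = []
    for header, seq in parse_records(fasta_string):
        if header == "<no header>":
            problems.append((header, "sequence data before first '>' line"))
        if not seq:
            problems.append((header, "empty sequence"))
            continue
        bad = sorted(set(seq.upper()) - VALID_BASES)
        if bad:
            problems.append((header, "unexpected characters: " + "".join(bad)))
    return problems
```

Calling `check_fasta(fasta_string)` on each chunk before blasting it returns an empty list when nothing suspicious is found, so a failing chunk that also passes this check points away from the sequence text itself.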
You should post one of the smaller files that fails somewhere we can download it from.
So where did you find that command line? blastn doesn't work with nr.
You're wrong Michael, QBLAST with "blastn" DOES work with nr - see below. It is probably treated as an alias for nt, given that NCBI refers to it as "Nucleotide collection (nt/nr)" on the BLASTN website. It is surprising though, as NR normally means the protein database.
I've updated the Biopython Tutorial to use "nt" rather than "nr" to avoid the confusion - thanks for flagging this Michael: https://github.com/biopython/biopython/commit/60fed13c350ab8e3f2e79b69d490b0701a1b2540
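For reference, the call from the question with "nt" in place of "nr" might look like the sketch below. The wrapper `blast_chunk` and its `qblast` parameter are hypothetical names added here so the function can be tried with a stand-in instead of a live NCBI request; only `NCBIWWW.qblast` and its `megablast` argument come from the question. The wrapper does not fix error code 1. It only re-raises the `ValueError` with the first header of the chunk, which makes the failing file easier to identify.

```python
def blast_chunk(fasta_string, megablast=False, database="nt", qblast=None):
    """Submit one FASTA chunk with blastn and return the result handle."""
    if qblast is None:
        # Imported lazily so the wrapper can be tested without a network call.
        from Bio.Blast import NCBIWWW
        qblast = NCBIWWW.qblast
    try:
        return qblast("blastn", database, fasta_string, megablast=megablast)
    except ValueError as err:
        # Report which chunk failed; the original NCBI message is preserved.
        stripped = fasta_string.strip()
        first_header = stripped.splitlines()[0] if stripped else "<empty chunk>"
        raise ValueError(
            f"qblast failed for chunk starting with {first_header!r}: {err}"
        ) from err
```

In the original script this would replace the direct call, for example `result_handle = blast_chunk(fasta_string, megablast=MEGA_BLAST)`.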