Hi there, I have managed to query the nr database with the BLAST REST API using code from their web_blast perl script, but I have not figured out how to query the nr_clustered database.
Does anyone know what parameters can be used to query nr_clustered via the API?
The web_blast perl script is here:
https://blast.ncbi.nlm.nih.gov/docs/web_blast.pl
Just swapping nr for nr_clustered in the command line args does not work.
thanks for the cross reference. I am still debugging but I might have actually gotten the API to return clustered results using the database name nr_cluster_seq. I will post an update if I can confirm
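For anyone who wants to try the same thing outside of Perl, here is a minimal Python sketch of the submit step that web_blast.pl performs, with the database name set to nr_cluster_seq as reported above. The function names are my own (hypothetical), the parameter names (CMD, PROGRAM, DATABASE, QUERY) and the RID/RTOE parsing follow what web_blast.pl does, and whether nr_cluster_seq keeps being accepted by the API is an assumption based on this thread, not something NCBI documents here.

```python
import re
import urllib.parse
import urllib.request

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_put_params(query, program="blastp", database="nr_cluster_seq"):
    """Build the form fields for a CMD=Put request (hypothetical helper).

    The default database name is the one reported in this thread; it is an
    assumption that the server continues to accept it.
    """
    return {
        "CMD": "Put",
        "PROGRAM": program,
        "DATABASE": database,
        "QUERY": query,
    }


def parse_put_response(text):
    """Extract the request ID (RID) and estimated wait in seconds (RTOE)."""
    rid = re.search(r"^\s*RID = (\S+)", text, re.MULTILINE)
    rtoe = re.search(r"^\s*RTOE = (\d+)", text, re.MULTILINE)
    if rid is None:
        raise ValueError("no RID found in BLAST response")
    return rid.group(1), int(rtoe.group(1)) if rtoe else 0


def submit(query, **kwargs):
    """Send the search to NCBI and return (rid, rtoe). Needs network access."""
    data = urllib.parse.urlencode(build_put_params(query, **kwargs)).encode()
    with urllib.request.urlopen(BLAST_URL, data=data) as resp:
        return parse_put_response(resp.read().decode())
```

After `submit()` returns, you would wait and then poll with CMD=Get and the RID, the same way web_blast.pl does, before fetching the results.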
Can confirm that database name you discovered is working with command line remote blast+ (v.2.15). Of course if you have many/large queries ....
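A rough sketch of what that remote command line search could look like when driven from Python. The helper name and the file paths are placeholders of mine; `-db`, `-query`, `-out` and `-remote` are standard blast+ options, and the database name is the one from this thread (an assumption that it stays available remotely).

```python
import subprocess


def remote_blast_command(query_path, out_path, program="blastp",
                         database="nr_cluster_seq"):
    """Return the argument list for a remote blast+ search (hypothetical helper)."""
    return [
        program,
        "-db", database,       # database name reported in this thread
        "-query", query_path,  # FASTA file with the query sequence(s)
        "-out", out_path,      # where blast+ writes the report
        "-remote",             # run the search on NCBI servers
    ]


def run_remote_blast(query_path, out_path, **kwargs):
    """Run the search; requires blast+ on PATH and network access."""
    subprocess.run(remote_blast_command(query_path, out_path, **kwargs), check=True)
```

For example, `run_remote_blast("query.fasta", "hits.txt")` would be the equivalent of typing `blastp -db nr_cluster_seq -query query.fasta -out hits.txt -remote` in a shell.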
Note: It is producing results that look like "normal" nr blast though, i.e. single fasta header lines in hits. Result does have the following
Ya, that is something I was checking into. It seemed like the results didn't really display the 'clustered' type results that the web UI shows, but I will keep looking.
I confirmed that the results of a search against plain nr are different and clearly show which database is being used.
From the clustered search on the web (see below), the command line blast output seems to be selecting the top hit as shown in the clustered database (yellow highlight below) in the results we get.
My assumption is that the blast+ code (released to public) does not have the necessary bits to show clustered headers/results. This matches the output we currently get from the command line search against clustered_nr.
Clicking on the Download button in web clustered blast allows one to download clustered output. I assume this format will make it into the public blast+ code when NCBI is ready to release clustered_nr as a public download.
Note: Only the sequence of the top "hit" is shown in the alignment, though you can see members of the cluster in a separate section.