I want to use Biopython's qblast() function to query NCBI's databases, but I want to limit my search to specific organisms. From the documentation, I can guess that the "entrez_query" parameter might help, but I have not been able to find any information about what sort of value it expects. I tried providing a taxid value in a few different forms, none of which worked:
result_handle = NCBIWWW.qblast('blastp', 'nr', record.seq, entrez_query='(3702)')
result_handle = NCBIWWW.qblast('blastp', 'nr', record.seq, entrez_query='(taxid=3702)')
result_handle = NCBIWWW.qblast('blastp', 'nr', record.seq, entrez_query='(taxid:3702)')
Each time I got an error message explaining that this was an invalid Entrez query, but I'm not sure what a valid Entrez query looks like. I have also looked for examples of this option being used but have not found any.
Any and all help would be appreciated.
Thank you so much! Any chance you know how to specify multiple organisms in a single query? Or how to specify that certain taxonomic groups should be excluded? Ultimately, I'm going to want to search within taxonomic groups while excluding specific species...
You can use a Boolean search, something like:
"all [filter] NOT(environmental samples[organism] OR metagenomes[orgn]) AND txid3702[ORGN] AND txid9606[ORGN]"
This example excludes environmental and metagenomic samples and includes human in the search.
Hi, I have tried AND as well as and to target a set of organisms, but my BLAST output only contains hits for the last txid[ORGN] term instead of showing results for all the organisms. Is there a way to loop over multiple taxid organisms?
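One thing that may explain this: AND asks for records that belong to every listed taxon at once, so to search several organisms together the txid terms probably need to be joined with OR. Below is a minimal sketch of building such a query from a list of taxids. The helper names build_entrez_query and blast_restricted are made up for illustration and are not part of Biopython. The sketch also assumes that a txidNNNN[ORGN] term for a higher taxon covers the species beneath it, which is worth checking against your own results.

```python
def build_entrez_query(include_taxids, exclude_taxids=()):
    """Build an Entrez query limiting a BLAST search to some taxa.

    Hypothetical helper: included taxids are joined with OR, and any
    excluded taxids are removed with NOT.
    """
    if not include_taxids:
        raise ValueError("at least one taxid to include is required")
    included = " OR ".join(f"txid{t}[ORGN]" for t in include_taxids)
    query = f"({included})"
    if exclude_taxids:
        excluded = " OR ".join(f"txid{t}[ORGN]" for t in exclude_taxids)
        query += f" NOT ({excluded})"
    return query


def blast_restricted(sequence, include_taxids, exclude_taxids=()):
    """Run one blastp search against nr, restricted by taxon.

    Hypothetical wrapper around NCBIWWW.qblast. Biopython is imported
    here so the query builder above works without it. This call needs
    network access and can take a while to return.
    """
    from Bio.Blast import NCBIWWW

    query = build_entrez_query(include_taxids, exclude_taxids)
    return NCBIWWW.qblast("blastp", "nr", sequence, entrez_query=query)


# The two taxids from this thread, searched together in one query.
print(build_entrez_query([3702, 9606]))
# (txid3702[ORGN] OR txid9606[ORGN])
```

To search within a group while leaving out particular species, pass the group's taxid in include_taxids and the species' taxids in exclude_taxids. If you would rather have a separate result set per organism, calling blast_restricted(record.seq, [taxid]) once for each taxid in a for loop should also work, at the cost of one BLAST request per organism.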