8.2 years ago
QVH
Hello,
I want to BLAST some sequences against dozens of huge SRA datasets. To develop a pipeline, I tried sending different requests from my computer in Python. Everything works fine with the standard parameters (database = nr, for example), but I can't get it to work when I want to use an SRA dataset as the database (DRX029854, for example, with the BLAST_SPEC:SRA parameter).
import requests

fasta_file = 'BlastList.fasta'
database_num = 'DRX029854'  # SRA experiment accession used as the database
url = 'http://blast.ncbi.nlm.nih.gov/Blast.cgi'
args = {'CMD': 'Put', 'DATABASE': database_num, 'PROGRAM': 'tblastn', 'BLAST_SPEC': 'SRA',
        'FORMAT_TYPE': 'XML', 'MAX_NUM_SEQ': '20000',
        'WORD_SIZE': '6', 'FILTER': 'F'}
# Submit the search; the FASTA file is uploaded as the QUERY field
with open(fasta_file, 'rb') as query:
    req = requests.post(url, params=args, files={'QUERY': query})
Does someone know which API or any kind of specific parameter I have to use to make it work with SRA databases?
Thanks
Maybe take a look at this code: https://github.com/Kingsford-Group/sbtappendix/blob/master/srablast/srablast.py
It's the same kind of code I'm using, and unfortunately it's not working. I get this error:
I think the problem could be due to the way the database is called.
It appears that the script is looking at local file paths rather than the web for the database.
Are you able to get this to work for some other acc #? Perhaps your local firewall settings are preventing you from going out to NCBI.
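One way to tell a rejected submission apart from a network or firewall problem is to look at what Blast.cgi actually sent back. The sketch below is only an illustration: it assumes the Put response embeds a `QBlastInfoBegin ... QBlastInfoEnd` comment block carrying `RID` and `RTOE`, which is how I understand the NCBI URL API to behave. The function name `parse_qblast_info` and the sample text (including the RID value) are made up for the example.

```python
import re


def parse_qblast_info(html: str) -> dict:
    """Collect KEY = VALUE pairs from QBlastInfo comment blocks in a response body.

    Assumption: an accepted Put request contains a block with RID and RTOE.
    An empty result suggests the submission was not accepted as a search.
    """
    info = {}
    blocks = re.findall(r"QBlastInfoBegin(.*?)QBlastInfoEnd", html, flags=re.S)
    for block in blocks:
        pairs = re.findall(
            r"^[ \t]*(\w+)[ \t]*=[ \t]*(\S+)[ \t]*$", block, flags=re.M
        )
        for key, value in pairs:
            info[key] = value
    return info


# Made-up sample shaped like the assumed response format (not real NCBI output)
sample = """<!--QBlastInfoBegin
    RID = ABC123XYZ
    RTOE = 25
QBlastInfoEnd
-->"""

info = parse_qblast_info(sample)
print(info.get("RID"), info.get("RTOE"))
```

With the request from the question, this would be called as `parse_qblast_info(req.text)`. If no RID comes back for the SRA accession but one does for 'nr', the connection itself is fine and the problem is on the database side of the request.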
Also be aware that at the time I write this NCBI is testing https-only access to their site. So you may want to replace that http with https (https://blast.ncbi.nlm.nih.gov/Blast.cgi), just in case.
It works perfectly when I try to blast something on a standard database (e.g. 'nr'), for example:
So, I really think the problem is due to the way I call the SRA database.
Do you know for sure that NCBI allows remote access to SRA database for blast?
I contacted NCBI in order to know whether this function has been removed.
Looks like it's not "officially supported" by NCBI; they recommend a cloud implementation with Amazon.