3.5 years ago
langziv
Hi.
I ran this script before and it worked fine. Maybe there's a small change that causes this. Also, there's no error message.
Here's the script:
#!/bin/bash
#PBS -N blastx
#PBS -e /err_and_out_files/blastx.ER
#PBS -o /err_and_out_files/blastx.OU
#PBS -l nodes=compute-0-311:ppn=20,mem=100gb
export BLASTDB="/bioseq/biodb/BLAST/Proteins2/taxdb"
module load blast/blast-2.10.0
blastx -query /output/fasta_files/btcaA1_filtered.fa \
  -db /bioseq/biodb/BLAST/Proteins2/nr \
  -max_hsps 1 -max_target_seqs 10 -num_threads 4 -evalue 1e-5 \
  -out /output/blast/blastx/btcaA1_filtered.txt \
  -outfmt "6 qseqid sseqid pident staxids sskingdoms qstart qend qlen length sstart send slen evalue mismatch gapopen bitscore stitle"
Thanks!
Just realized that a near-identical question of yours was already answered here. Since that answer was accepted, I assumed that it solved your problem.
Thank you, Mensur Dlakic. Since you mentioned SwissProt, would you use both NCBI's database and SwissProt for blastx? Maybe it's a good idea to search multiple databases, provided they are trustworthy.
SwissProt is a curated database of proteins with known function and reliable annotation. It has fewer than a million sequences, if I remember correctly, and it is not meant for large-scale searching. Besides, all of its sequences are already included in the nr database. I suggested it to you as a quick way of checking whether your software and hardware setup is correct, because the search should finish in less than 1% of the time it takes to search nr. UniRef90, on the other hand, is a good substitute for nr in my opinion, and is about 40% of the size of nr.
From your experience, is it normal for a blastx run to last multiple days when searching against BLAST's protein database, when the input FASTA file consists of a single sequence of 10,368 base pairs and no output has been written? Or does that indicate that something is not working?
Please do not delete posts that have received feedback.
I thought everything was explained in my previous answer, but I will try again.
You seem to be using a shared computer and running this through some kind of batch submission system. It is not normal for a blastx run on a single sequence to take multiple days, but it could be that your system is reading the database slowly because of swapping or high load. Or it could be that something is wrong with your programs and/or database setup. That is why I suggested that you try SwissProt, because it is a small fraction of the nr database. If a search against SwissProt does not finish in a matter of minutes, that would hopefully tell you whether the problem is a slow computer system or a wrong software setup.
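One way to try the SwissProt check described above is to run the search under a wall-clock limit, so that the job reports clearly whether it finished, failed, or timed out. The following is only a rough sketch: the helper name run_timed and the SWISSPROT_DB path are made up for illustration, and it assumes GNU coreutils timeout is available on the compute node.

```bash
#!/bin/bash
# Sketch: run a command under a wall-clock limit and report how it ended.
# Assumes GNU coreutils `timeout` is installed; run_timed is a hypothetical helper.

run_timed() {
    local limit="$1"; shift          # first argument: time limit, e.g. 30m
    local start end status
    start=$(date +%s)
    timeout "$limit" "$@"            # run the remaining arguments as the command
    status=$?
    end=$(date +%s)
    if [ "$status" -eq 124 ]; then   # 124 is what `timeout` returns when the limit is hit
        echo "timed out after ${limit} (elapsed $((end - start)) s)" >&2
    else
        echo "exit status ${status} after $((end - start)) s" >&2
    fi
    return "$status"
}

# Example use (commented out). SWISSPROT_DB is a hypothetical path; point it
# at wherever a SwissProt BLAST database lives on your system.
# SWISSPROT_DB=/path/to/swissprot
# run_timed 30m blastx -query /output/fasta_files/btcaA1_filtered.fa \
#     -db "$SWISSPROT_DB" -max_hsps 1 -max_target_seqs 10 -num_threads 4 \
#     -evalue 1e-5 -outfmt 6 -out swissprot_test.txt
```

If the SwissProt run ends with status 0 within minutes, the software setup is probably fine and the nr search is more likely limited by the system. A timeout or a non-zero status on such a small database points toward the setup instead.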