Hi,
I am trying to BLAST a FASTA file of protein sequences against the non-redundant (nr) database on an HPC cluster. I run the following command:
cat prot/split_fasta/master.dataframe.tide-tandem.protein.part_001.fa | parallel --GNU --block 100k --recstart '>' --pipe '/home/users/nus/e0470749/ncbi-blast-2.8.1+/bin/blastp -query - -db nr -outfmt "6 std slen qlen stitle staxids sscinames" -max_target_seqs 500 -num_threads 12 -evalue 0.001' > seps_nr_out_001.txt
However, the job gets terminated with Exit Status: 1. Based on previous posts about the same error, I thought this was a memory issue, so I split my original FASTA file (10,000 sequences) into smaller parts. The current file contains ~100 sequences. I also ran the job with 1 TB of memory, which seems to be sufficient based on the usage report:
Resource Usage on 2021-11-08 11:52:18.892810:
JobId: 6845745.wlm01
Project: personal
Exit Status: 1
NCPUs Requested: 12 NCPUs Used: 12
CPU Time Used: 11:50:04
Memory Requested: 1tb Memory Used: 159785036kb
Vmem Used: 266577592kb
Walltime requested: 12:00:00 Walltime Used: 01:39:10
Execution Nodes Used: (lmn2609:mem=1073741824kb:ncpus=12)
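For reference, the kb figures in the report can be converted to GiB with a small sketch like the one below. It assumes 1 GiB = 1048576 kb, which matches the report's own equivalence of 1tb with 1073741824kb. The helper name kb_to_gib is only illustrative.

```shell
#!/bin/sh
# Convert a kb value from the usage report to GiB.
# Assumption: 1 GiB = 1048576 kb (the report equates 1tb with 1073741824kb).
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1048576 }'
}

kb_to_gib 159785036    # Memory Used
kb_to_gib 266577592    # Vmem Used
kb_to_gib 1073741824   # Memory Requested (1tb)
```

This works out to roughly 152 GiB of memory and 254 GiB of virtual memory used, both well under the 1 TB requested.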
The Blast database also seems to be normal. Running ~/ncbi-blast-2.8.1+/bin/blastdbcmd -info -db blastdb/nr
gives:
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
436,338,278 sequences; 161,860,501,762 total residues
Is there anything else I can try to solve this? The only other thing I can think of is to downgrade BLAST.
Thanks for your help. Yes, I am using the pre-formatted nr database (v. 5). I am using a slightly older BLAST+ (v. 2.8.1) because the HPC server I am working on has an outdated GLIBC. When I use the latest BLAST+ by running
./ncbi-blast-2.12.0+/bin/blast+
, I get this error:
I will try using an older nr database (v. 4) in this case.
These are the first few lines of my FASTA file:
Unless you create a new version of the v.4 indexes yourself, the old nr database version that you can download from NCBI is frozen as of Feb 2020. Keep that in mind.
You could try a small subset of the FASTA file you have and see if you get that error. If you do, then you may want to remove (+1) etc. from the FASTA headers and see if that eliminates the error.
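A minimal sketch of both suggestions is below. It assumes the tags in the headers look like "(+1)" or "(-2)", which may not match the real file exactly, so the pattern may need adjusting. The function names fasta_head and strip_header_tags are hypothetical helpers, not part of BLAST.

```shell
#!/bin/sh
# Hypothetical helpers; adjust the sed pattern to the actual header format.

# Print only the first N records of a FASTA file (a record starts with '>').
# Usage: fasta_head input.fa N
fasta_head() {
    awk -v n="$2" '/^>/ { count++ } count > n { exit } { print }' "$1"
}

# Remove "(+1)"-style tags (a sign followed by digits, in parentheses)
# from header lines only; sequence lines are left untouched.
# Usage: strip_header_tags input.fa
strip_header_tags() {
    sed -E '/^>/ s/ ?\([+-][0-9]+\)//g' "$1"
}
```

For example, `fasta_head part_001.fa 5 > subset.fa` followed by `strip_header_tags subset.fa > subset.clean.fa` would give two small files (the file names here are placeholders) that can be run through the same blastp command to see whether the error depends on the headers.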