Blast job never stops
4.0 years ago
langziv ▴ 70

Hi.

Yesterday I started running a job with a script for BLAST. An output file was created. Now, almost 24 hours later, the file is still empty, no other files have been created, and the job keeps running. I think it won't stop unless I stop it. I was told that it might be due to a contig that's too long.

It may be important to note that initially I couldn't keep the job running, since it kept failing for exceeding the memory limit. To overcome this I set a higher limit of 100 GB. That's the highest I have ever had to set.

The script:

#!/bin/bash
#PBS -q ...
#PBS -N ...
#PBS -e ...
#PBS -o ...
#PBS -l nodes=1:ppn=20,mem=100gb

module load blast/blast-2.10.0
cd /some/path/

blastx -query A1/scaffold.fa \
-db /root/BLAST/Proteins2/nr \
-max_hsps 1 -max_target_seqs 1 -num_threads 20 \
-out just_trying.txt \
-outfmt "6 std staxids qseqid sseqid staxids sscinames scomnames stitle"

Does anyone have an idea what to do?

blast linux • 895 views

not directly related to your issue but a few points to be aware of:

  • be cautious (or at least understand exactly what they imply) when using parameters such as -max_hsps 1 and/or -max_target_seqs 1; they can cause 'unexpected' results (search online for details)

  • Also: using 20 threads will likely not give much of a speed increase. Only a small part of BLAST is parallelised, and with 20 threads you are almost certainly on the plateau (it has been said that anything above 4-5 threads is unlikely to add much)
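If you want to check the thread plateau on your own cluster rather than take it on faith, a rough timing loop like the one below could help. It is only a sketch: bench_threads is a made-up helper name, the thread counts are arbitrary, and it assumes you run it with a small test query and a small database so each run finishes quickly.

```shell
#!/bin/sh
# bench_threads QUERY DB: time blastx at several thread counts.
# Hypothetical helper; QUERY should be a small FASTA file so runs are short.
bench_threads() {
    bt_query=$1
    bt_db=$2
    for bt_t in 1 2 4 8; do
        bt_start=$(date +%s)
        # Discard the hits; only the elapsed time matters here.
        blastx -query "$bt_query" -db "$bt_db" \
            -num_threads "$bt_t" -outfmt 6 -out /dev/null
        bt_end=$(date +%s)
        echo "threads=$bt_t seconds=$((bt_end - bt_start))"
    done
}

# Example call (both paths are hypothetical):
# bench_threads small_test.fa /path/to/small_db
```

If the seconds stop dropping between two thread counts, requesting more cores than that for the real job is unlikely to help.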


Give us more info or flip a coin.


I added the job's script.


Log onto the cluster node your job has been allocated and check what's happening there (e.g. with top).
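Something along these lines might do for the check, assuming your cluster lets you ssh to compute nodes (not all do). check_node is a made-up helper name and the node name in the example is a placeholder; on PBS/Torque, `qstat -n <jobid>` should list the node(s) the job was given.

```shell
#!/bin/sh
# check_node NODE: one snapshot of your processes and the memory state on NODE.
# Hypothetical helper; assumes ssh access to the compute node is permitted.
check_node() {
    cn_node=$1
    cn_user=${USER:-$(id -un)}
    # top -b -n 1 prints a single batch-mode snapshot;
    # free -g reports RAM and swap usage in GiB.
    ssh "$cn_node" "top -b -n 1 -u $cn_user | head -n 20; free -g"
}

# Example (node name is a placeholder):
# check_node compute-0-12
```

A blastx process with high %CPU suggests it is still working; a lot of swap in use would point to memory still being too tight.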

4.0 years ago
Mensur Dlakic ★ 28k

I already answered this question in the other thread you started.

Files are written in chunks that correspond to sector sizes. BLAST will not have written enough output to go over a sector size until it completes a search with at least one sequence, as hits for each sequence are written at the end. As long as the search is still ongoing with the very first query sequences (assuming you have more than one), it is possible that there will be no output in that file.

There could be multiple explanations: your computer is slow (either objectively, or because it is shared with many other people); 100 GB is still not enough, so there is a lot of disk swapping, which makes everything slow; you have a long query sequence that takes a while to search; or all of the above.

I suggest you take a shorter query sequence and a smaller database, and run a test to make sure that everything works. Assuming it does, you may need more than 100 GB of memory assigned, a smaller database, or a faster computer. If all of that fails, you will need more patience.
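One way to set up such a test is to cut the first few sequences out of the query file and search only those. The sketch below assumes a plain FASTA file; first_n_seqs, test_query.fa, test_run.txt and the small database path are made-up names for illustration, not anything from the original script.

```shell
#!/bin/sh
# first_n_seqs N FASTA: print the first N records of a FASTA file.
# Hypothetical helper for building a small test query.
first_n_seqs() {
    awk -v n="$1" '
        /^>/ { count++ }        # each header line starts a new record
        count > n { exit }      # stop once record N+1 begins
        { print }
    ' "$2"
}

# Example test run (output names and database path are hypothetical):
# first_n_seqs 5 A1/scaffold.fa > test_query.fa
# blastx -query test_query.fa -db /path/to/small_db \
#     -num_threads 4 -outfmt 6 -out test_run.txt
```

If the first records in the file happen to be the very long contigs, the test will be slow for the same reason as the full job, so it may be worth picking a few short sequences by hand instead.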
