Hello,
I've been running bwa mem to do pairwise alignment to a reference. I have already indexed my reference, which produced 5 output files with the following extensions: .pac, .sa, .amb, .ann, .bwt. I have run bwa mem with 60G of memory multiple times now and none of the jobs ever finish. They take over 20 hours and often time out at 24 hours. I have run it as:
bwa mem -t 24 ref read1.fq.gz read2.fq.gz > output.sam
bwa mem -1 -t 1 ref read1.fq.gz read2.fq.gz > output.sam
bwa mem ref read1.fq.gz read2.fq.gz > output.sam
Regardless of how I run it or with which parameters, the standard error files end with the following:
[M::mem_pestat] low and high boundaries for proper pairs: (1, 834)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 66668 reads in 251.152 CPU sec, 251.145 real sec
OR
[M::mem_process_seqs] Processed 66668 reads in 187.866 CPU sec, 187.859 real sec
[M::process] read 66668 sequences (10000200 bp)..
I'm confused about what this error is. Why doesn't the job ever finish? When I google it, no one else seems to have this issue.
Any constructive help is great!! I'm honestly fed up with bwa, it seems I'm not the only one that experiences this problem. It's funny how it is so popular when it doesn't work. Are there any other platforms that you would suggest?
Hello joomi,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Hello,
I should have mentioned that I already redirect to a SAM file! Sorry about that. My main issue is that all my scripts are timing out. I've been running them for 24 hours and they end up stuck at the point shown in the standard error files. I'm not sure why it is so slow. Is this normal? How many days should I let it run?
Thank you for your help!!
Any error at the end of the log/std error?
PS: Please move this post to a comment on Fin's post.
Hello,
I don't have a log file. I have a standard output file and a standard error file. The standard output looks like this:
and the standard error looks like:
Am I missing something? I'm sorry if this is a stupid question but I don't think I have a log file, just a .err and .out file.
Just to be sure: what version of bwa are you using? You should test with only a small subset of your reads, for example 1000 reads. You can create such a subset by:
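One way to do this, as a minimal sketch rather than a definitive recipe: a FASTQ record is 4 lines, so the first 1000 reads are the first 4000 lines of each file. `subset_fastq` is a hypothetical helper name introduced here for illustration, and the file names are the ones from the question. It assumes both files list the mates in the same order, so taking the same number of reads from the top of each keeps the pairs in sync.

```shell
# Hypothetical helper: keep the first N reads of a gzipped FASTQ file.
# A FASTQ record is 4 lines, so N reads = N * 4 lines.
subset_fastq() {
    local in_fq="$1" n_reads="$2" out_fq="$3"
    # gzip -dc is equivalent to zcat and also works where zcat expects .Z files
    gzip -dc "$in_fq" | head -n "$(( n_reads * 4 ))" > "$out_fq"
}

# Example usage (file names taken from the question, assumed to exist):
#   subset_fastq read1.fq.gz 1000 subset_1.fq
#   subset_fastq read2.fq.gz 1000 subset_2.fq
#   bwa mem ref subset_1.fq subset_2.fq > subset.sam
```

If the subset finishes quickly, the setup itself works and the problem is more likely the size of the full input or the resources given to the job.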
See what happens then.
fin swimmer
Thank you so much! I subsetted the two read files with zcat and ran bwa. It completed within minutes! My original files are around 7.4G in size and contain around 21959218 reads. This is for the species Zea mays, and in all of my runs I have used around 60G of memory; 63-64G is the maximum that I have access to. I've run it with -t 24 as a batch job with 24 cores before and it still gets stuck. Is my best bet to split the read files into two equal halves and run subset1_1 with subset1_2 and subset2_1 with subset2_2, then concatenate the SAM files together? I can do that, but I have 108 of these to do.
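A rough back-of-the-envelope estimate may help decide whether the job is stuck or just slow. This sketch assumes that every batch takes about as long as the ones in the standard error output above (roughly 250 seconds per 66668 reads) and that 21959218 is the total number of reads bwa has to process. Both are assumptions, not measurements; if the count is per file, the estimate doubles.

```shell
# Figures taken from the posts above; the per-batch time is a rounded assumption.
reads_total=21959218      # read count reported for the input
reads_per_batch=66668     # reads per [M::mem_process_seqs] batch in the log
secs_per_batch=250        # approximate real seconds per batch in the log

# Number of batches, rounded up
batches=$(( (reads_total + reads_per_batch - 1) / reads_per_batch ))

# Estimated wall time in whole hours (integer division)
est_hours=$(( batches * secs_per_batch / 3600 ))

echo "batches=$batches est_hours=$est_hours"
```

Under these assumptions the estimate lands close to the 24-hour limit, which would be consistent with a job that is progressing slowly rather than failing.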
Is this a reference-based assembly, joomi?
Please use Add Reply for comments or directly include them into the question by using edit. Otherwise the thread becomes messy pretty soon.