Hi, I am running BWA-MEM to align my paired-end (PE) reads against the human reference genome (GRCh38). The run proceeds, but I am getting what looks like an error message. Please have a look at the command and the resulting output:
/usr/local/bwa-0.7.12/bwa mem -t 14 \
-M /san/illumina_two/rsindhu_sge/Human_ref_genomes/GRCh38/FINAL/GRCh38.fa \
Read1.fastq.gz Read2.fastq.gz > Read12.bwa.sam
Part of the output:
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1429588 sequences (140000077 bp)...
[M::process] read 1431216 sequences (140000054 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (19, 558476, 545, 5)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (126, 180, 401)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 951)
[M::mem_pestat] mean and std.dev: (243.78, 179.56)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1226)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (334, 387, 448)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (106, 676)
[M::mem_pestat] mean and std.dev: (391.91, 87.32)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 790)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (24, 49, 80)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 192)
[M::mem_pestat] mean and std.dev: (54.23, 40.99)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 248)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 1429588 reads in 1170.454 CPU sec, 83.775 real sec
[M::process] read 1429172 sequences (140000054 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (15, 558767, 537, 10)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (100, 207, 290)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 670)
[M::mem_pestat] mean and std.dev: (212.33, 136.90)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 860)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (332, 385, 446)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (104, 674)
[M::mem_pestat] mean and std.dev: (390.32, 87.22)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 788)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (24, 46, 86)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 210)
[M::mem_pestat] mean and std.dev: (56.51, 44.37)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 272)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (379, 1210, 2961)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 8125)
[M::mem_pestat] mean and std.dev: (1898.40, 2191.92)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10707)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
..........
Also, the resulting .sam file is much smaller than the one from an earlier run that did not show these messages (34G earlier, 14G now). Please provide your suggestions.
Many thanks,
Ravi
What error message? All I see is the typical bwa status information.
Thank you for the response, Devon and Pierre. I assumed these were error messages because the output looked different from a BWA run I did a few days ago. Anyway, thanks for clarifying that these are normal run log messages. For context, this is the alignment of the 2nd of 8 lanes for one sample of NGS data (PE, human sample, ~30X coverage). The two input files (Read1.fastq.gz and Read2.fastq.gz) are 3.1G each, and the output .sam file is 14G.
The lane 1 input files were the same size (3.1G), but the output .sam file was 34G, so I thought something was wrong with my run. I still have to find out why the output SAM files of these two lanes differ so much in size, so I'll re-align my lane 1 data. Also, I am new to this field, so I am asking for help to clear up my doubts. Thank you, Ravi.
Presumably the lane with a larger SAM file either had a higher mapping rate or more multimapped and/or chimeric alignments.
Please let me know where and how I can check this information about the mapping rate and multimapped or chimeric alignments. Thanks
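One way to look at this, as a minimal sketch: the second column of every SAM record is the FLAG field, and the SAM specification defines bit 0x4 as "unmapped", 0x100 as "secondary" and 0x800 as "supplementary" (the usual representation of a chimeric/split alignment). The helper name `sam_flag_summary` below is made up for illustration, and the function assumes an uncompressed SAM file. Note that the command above uses `-M`, which makes bwa mem mark shorter split hits as secondary instead of supplementary, so split alignments would show up in the "secondary" row.

```shell
# Hypothetical helper: tally SAM records by FLAG bits.
# Bit values are from the SAM specification:
#   0x4 (4) unmapped, 0x100 (256) secondary, 0x800 (2048) supplementary.
sam_flag_summary() {
    # $1: path to an uncompressed SAM file
    awk -F'\t' '
        /^@/ { next }                                # skip header lines
        {
            flag = $2
            if (int(flag / 2048) % 2)     supp++     # supplementary record
            else if (int(flag / 256) % 2) sec++      # secondary record
            else if (int(flag / 4) % 2)   unmapped++ # unmapped primary record
            else                          mapped++   # mapped primary record
        }
        END {
            printf "primary_mapped\t%d\n", mapped
            printf "primary_unmapped\t%d\n", unmapped
            printf "secondary\t%d\n", sec
            printf "supplementary\t%d\n", supp
        }
    ' "$1"
}

# Example call (file name taken from the command above):
#   sam_flag_summary Read12.bwa.sam
```

The mapping rate is then roughly primary_mapped / (primary_mapped + primary_unmapped). If samtools is installed, `samtools flagstat` reports similar totals without any custom code.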
Give the number of reads in the input FASTQ and the number of lines in the output SAM.
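The two counts asked for here can be obtained along these lines. This is only a sketch: the function names are invented for illustration, and it assumes standard four-line FASTQ records and an uncompressed SAM file whose header lines start with `@`.

```shell
# Hypothetical helpers for comparing input reads with output records.

count_fastq_reads() {
    # $1: gzip-compressed FASTQ; assumes exactly 4 lines per read
    gzip -dc "$1" | awk 'END { print int(NR / 4) }'
}

count_sam_records() {
    # $1: uncompressed SAM; counts every line that is not a header line
    awk '!/^@/ { n++ } END { print n + 0 }' "$1"
}

# Example calls (file names taken from the command above):
#   count_fastq_reads Read1.fastq.gz
#   count_fastq_reads Read2.fastq.gz
#   count_sam_records Read12.bwa.sam
```

bwa mem writes a record for every input read, including unmapped ones, so the SAM record count should be at least the sum of the two FASTQ read counts. It can be higher because of secondary or split alignments. A count below that sum suggests the SAM file is incomplete.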
Hi, here is the information
Lane 1 read count and information
Lane 2 read count and information
It gave me an error saying that this file may be incomplete/truncated, so I reran it and found that the resulting file is similar in size to the lane 1 output (approx. 34G). Here is its information:
So, I finally have answers to both of my questions. First, it was not an error, just regular run messages. Second, the BWA run was incomplete, so the truncated .sam file was smaller (14G compared with 34G). Thank you all for your comments.
Regards,
Ravi
More recent versions of BWA print total CPU and wall-clock time at the end of the run. If you don't see it, the output is very likely to be incomplete.
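A rough way to automate this check is sketched below. It assumes bwa's stderr was saved to a log file, for example by appending `2> Read12.bwa.log` to the command (that log name is hypothetical), and that the closing line begins with `[main] Real time:`, which is how bwa 0.7.x prints it as far as I know. The second function is a separate heuristic on the SAM file itself. Both function names are invented for illustration.

```shell
# Hypothetical helpers for spotting an interrupted bwa mem run.

bwa_log_finished() {
    # $1: file holding bwa's stderr.
    # Succeeds only if the closing timing line is present
    # (assumed to start with "[main] Real time:").
    grep -q '^\[main\] Real time:' "$1"
}

sam_last_line_ok() {
    # $1: uncompressed SAM file.
    # Heuristic: the last line must have the 11 mandatory fields, and
    # QUAL must be "*" or have the same length as SEQ.
    tail -n 1 "$1" | awk -F'\t' '
        { ok = (NF >= 11 && ($11 == "*" || length($10) == length($11))) }
        END { exit (ok ? 0 : 1) }'
}

# Example calls:
#   bwa_log_finished Read12.bwa.log || echo "bwa did not finish"
#   sam_last_line_ok Read12.bwa.sam || echo "last SAM record looks cut off"
```

The last-line check only catches files cut off in the middle of a record. A file truncated exactly at a line boundary passes it, so the log check is the more reliable of the two.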
Yes, I noticed it in a few runs. Thank you for your comment, I appreciate it. Best wishes, Ravi.
I ran into the same issue. How did you deal with this error? Is there not enough memory, is the reference incomplete, or is there some other reason? Thanks!
Hi,
I have this problem too. Did you find a solution?
Best Regards, Mohammad