Hi Everyone!
I ran samtools flagstat and I am getting 0.0% properly paired reads in the BAM file. I sorted my BAM file and marked the duplicates as well. Is a BAM flag missing in my file, or are the mapping results of bad quality? Note that my average mapping quality (MAPQ) is more than 30. I used bwa-mem for mapping to the reference genome. I did not mark any duplicates in this step.
Can you please help me track down this issue?
My command arguments were:
samtools flagstat /CHG000691/CHG000691_PE_02AL3.sam > /Flagstat/CHG000691/lanewise/CHG000691_PE_02AL3_flagstat.txt
Output:
75031887 + 0 in total (QC-passed reads + QC-failed reads)
733759 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
74460957 + 0 mapped (99.24%:-nan%)
74298128 + 0 paired in sequencing
37149064 + 0 read1
37149064 + 0 read2
0 + 0 properly paired (0.00%:-nan%)
73397388 + 0 with itself and mate mapped
329810 + 0 singletons (0.44%:-nan%)
2837608 + 0 with mate mapped to a different chr
982994 + 0 with mate mapped to a different chr (mapQ>=5)
Thank you very much!
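For anyone who wants to check the flags directly rather than rely on the summary: "properly paired" corresponds to FLAG bit 0x2 in each SAM record, and the pairing counts above leave out secondary alignments (75031887 - 733759 = 74298128). Below is a minimal sketch, not a replacement for flagstat, that tallies those bits from SAM text on stdin. The function name count_pair_flags is made up for illustration, and it assumes a POSIX awk is available.

```shell
# Hypothetical helper (not part of samtools): tally pairing-related FLAG bits
# from SAM text read on stdin.
count_pair_flags() {
    awk '
        # Return 1 if the bit with value bit_value is set in flag.
        # Uses plain arithmetic so it does not depend on gawk-only and().
        function has_bit(flag, bit_value) { return int(flag / bit_value) % 2 }

        /^@/ { next }    # skip SAM header lines

        {
            flag = $2
            # Skip secondary (0x100) and supplementary (0x800) records.
            if (has_bit(flag, 256) || has_bit(flag, 2048)) next
            if (has_bit(flag, 1)) {                 # 0x1: read is paired
                paired++
                if (has_bit(flag, 2)) proper++      # 0x2: properly paired
            }
        }

        END { printf "paired=%d proper=%d\n", paired, proper }
    '
}
```

It could be run as count_pair_flags < /CHG000691/CHG000691_PE_02AL3.sam. If proper stays at 0 while paired is large, the 0x2 bit was never set by the aligner, which points at the alignment step rather than at sorting or duplicate marking.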
It would be helpful if you could answer these questions -
You can validate that the names appear to be correctly paired with the BBMap package:
If the reads are interleaved in a single file, use the verifyinterleaved flag instead.
In any case, it's odd that 0 reads are paired. I would expect there to be a few just by random chance out of 75 million (though that depends on the genome size). I'm guessing that bwa treated these as single-ended.
1. What was your bwa mem command?
I think it might be due to the -P option, but I am not sure. The BWA manual says: "In the paired-end mode, perform SW to rescue missing hits only but do not try to find hits that fit a proper pair."
2. What was the library type? (e.g. fragment, LMP [plus CLRS, Nextera, etc.])
Fragment-Sonicator
3. What was the expected insert size and orientation?
4. Are you sure the sequencing was paired?
Yes, I am sure the sequencing was paired-end, as you can see in the flagstat output:
5. If you used 2 files (rather than interleaved), are you sure the two files go together?
Yes.. :)
6. What kind of preprocessing did you do? (For example, fastx toolkit is notorious for destroying pairing order and should never be used with paired reads, or ever, IMO).
No preprocessing was performed.
7. What kind of experiment is it? DNA, RNA, etc.
Whole Genome DNA Sequencing
8. What version of bwa are you using?
BWA-Version: 0.7.12-r1039
You could easily check if -P is the culprit by repeating the alignment without it.
Did you ever figure out why this was happening? I'm having the same issue and can't figure out why.
Can you please post the output of head for the two FASTQ files? If they're gzipped, you can do this by typing
zcat [fastq_file] | head
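Looking at head only covers the first few records. As a rough sketch of a fuller check, the helper below walks both files in lockstep and counts records whose read names differ. The name check_fastq_pairing is hypothetical, and the sketch assumes uncompressed four-line FASTQ records and names that differ at most by a trailing /1 or /2 (anything after the first space or tab is ignored). Gzipped files would need to be decompressed first.

```shell
# Hypothetical helper: compare read names of two uncompressed FASTQ files
# record by record. Usage: check_fastq_pairing reads_1.fq reads_2.fq
check_fastq_pairing() {
    awk -v mate_file="$2" '
        # Reduce a header line to the bare read name:
        # keep the first whitespace-delimited token, drop a trailing /1 or /2.
        function read_name(header,    parts, name) {
            split(header, parts, /[ \t]/)
            name = parts[1]
            sub(/\/[12]$/, "", name)
            return name
        }

        {
            # Pull the matching line from the second file.
            if ((getline mate_line < mate_file) <= 0) { truncated = 1; exit }
            # Line 1 of every 4-line record is the header.
            if (NR % 4 == 1) {
                total++
                if (read_name($0) != read_name(mate_line)) mismatched++
            }
        }

        END {
            if (truncated) { print "second file ended early"; exit 1 }
            if ((getline mate_line < mate_file) > 0) { print "second file is longer"; exit 1 }
            printf "%d records checked, %d name mismatches\n", total, mismatched
        }
    ' "$1"
}
```

A non-zero mismatch count, or a length difference, would suggest the two files are out of sync, which is one way bwa could end up with no proper pairs.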
@Dan D
here is the output:
First File:
Second file: