Entering edit mode
7.7 years ago
user230613
▴
380
I'm trying to retrieve the number of reads and the number of alignments with MAPQ=0.
To address the question I'm using samtools
in the following way:
samtools view -@ 4 BAM | awk '{if($5==0) {print $0}}' > mapq0.sam
cut -f1 mapq0.sam | sort -u | wc -l
1606759 # Number of reads
wc -l mapq0.sam
2952315 #Number of alignments
However, I'm getting different numbers when running Samstat
:
MAPQ < 3 30814.0
I think I'm missing something very basic. Moreover, I'd expect to have a greater number in Samstat MAPQ<3 since obviously it also considers, MAPQ=1,2...
Any hint?
Thank you
Thank you for your answer, but actually I'm looking for an explanation in the difference between samstat and samtools counts :).
MAPQ=0 indicates multi-mappers. Have you considered/accounted for those when comparing those two numbers?
Yes, I have consider that, that's why I'm counting read names using
cut -f1...
commandPlease could you move your answer to a comment? I want to keep the question opened and unanswered :)
Question never really closes. Even for questions where an answer has been "accepted" one can add new answers.
If you are referring this
samstat
program then I would be scpetical. It seems to not have been updated in 1.5+ years. If you need an alternate give Qualimap or Bamstat a try.Yes, I'm referring to that
samstat
, and although it has not been updated in 1.5 years, we are not talking about a very up to date (changing ) issue, is only counting the number of reads with MAPQ=0. Is not a big deal :/I've used
qualimap
, but as I said, I'm not looking for other tools to do the same, I'm looking for an explanation of the difference between samstat and samtools.There seems to be some sort of a bug in
samstat
. I tried your method above on a small bam file and got different counts withsamstats
. You could report this to the developer.