Hello everyone,
I'm currently trying to use the samtools filtering expression as follows:
samtools view exemple.bam --input-fmt -filter="mapq >= 60" > tested.sam
Unfortunately, this doesn't seem to work properly as my output sam still contains alignments with mapq < 60. I think my use of this filtering option is wrong. But I thought this could work according to the following documentation: http://www.htslib.org/doc/samtools.html#FILTER_EXPRESSIONS However I couldn't find any exemple of a proper use of this method.
My ultimate goal being to achieve something like this:
samtools view -h input.bam --input-fmt-option "filter=mapq >= 5 && [MM]/qlen > 0.3 && [NM]/[MM] > 0.3" > output.sam
Thanks for the help
My answer is below, but as a comment, what is the MM tag and what tool produces it? This will clash with the base modification tag once it lands as an official SAM tag.
If you have private unofficial tags, they should be lowercase or start with X, Y or Z.
So I think the [MM] tag was to be used for counting the number of match and [NM] the number of mismatch... but I suppose I am wrong
Atleast for your first query, samtools view has the -q option which means (from its help)
-q INT - Skip alignments with MAPQ smaller than INT [0]
So, you can simply use:
This will remove reads less than MAPQ 60.
Not sure that you keep those reads with a Q value <= 60, or the opposite by using this -q option. In fact, the samtools view help indicates the following (using latest 1.12 version)
It means the same. Only the description of what -q means has changed with versions (english language vise). But essentially, you discard reads which have a MAPQ of less than 60.
yeah it's true there are ways to work arround this, but my intend is to go further and do more complex things starting from here