5.3 years ago
f1978hp
I came across a command line used for variant calling, but I don't understand one particular point. The command line is below.
samtools mpileup -C50 -d10000 -L1500 -uf REF_FASTA -I BEDFILE --output-tags DP,DP4 -r {} Sample.BAM
followed by
bcftools call -mv > sample.{}.vcf
The part that I don't understand is the parameter -C50. The samtools documentation says:
-C, --adjust-MQ INT
Coefficient for downgrading mapping quality for reads containing excessive mismatches. Given a read with a phred-scaled probability q of being generated from the mapped position, the new mapping quality is about sqrt((INT-q)/INT)*INT. A zero value disables this functionality; if enabled, the recommended value for BWA is 50.
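To get a feeling for the numbers, the approximation quoted from the documentation can be evaluated directly. This is only a minimal sketch of that one formula, not the samtools implementation: the function name `approx_adjusted_mapq` is made up for illustration, how samtools derives q from the mismatches is not covered here, and rejecting q outside 0..INT is an assumption of the sketch.

```python
import math


def approx_adjusted_mapq(q, coef=50):
    """Evaluate the documented approximation sqrt((INT - q) / INT) * INT.

    q    -- the phred-scaled probability that appears in the documentation's formula
    coef -- the INT passed to -C (50 in the command above)
    """
    if coef == 0:
        # Per the documentation, a zero value disables the adjustment.
        return None
    if not 0 <= q <= coef:
        # Assumption of this sketch: refuse values for which the formula is undefined.
        raise ValueError("q must lie between 0 and coef")
    return math.sqrt((coef - q) / coef) * coef


if __name__ == "__main__":
    for q in (0, 32, 50):
        print(q, round(approx_adjusted_mapq(q), 1))
```

With -C50 the formula yields 50 for q = 0 and 0 for q = 50, so larger values of q translate into a lower new mapping quality.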
Why use this parameter to downgrade MQ if there is no filter step after it?
Thanks for clarification!
Hello f1978hp,
As the documentation says, the mapping quality is downgraded for reads with excessive mismatches. Mapping quality is just one of the parameters bcftools uses to decide whether there is a variant or not. Decreasing the mapping quality in this way leads to fewer false positive variant calls overall. In my experience, an additional filter step is often not needed.
fin swimmer
BTW: samtools mpileup is deprecated and should be replaced by bcftools mpileup.
Hello fin swimmer!
But how does bcftools arrive at fewer false positive variant calls by decreasing the mapping quality? For example, if I had a read with excessive mismatches, the parameter -C 50 would decrease its MQ. At what point (with which argument) would bcftools then use the new MQ value of this read to decide whether there is a variant or not?
During variant calling, bcftools compares the average mapping quality of the reads supporting a variant with that of the reads that don't. If there is a significant difference, it will skip this variant. Because we decreased the mapping quality beforehand, this difference will be much larger for reads with excessive mismatches.
BTW: There is no need to post your comment twice. I will delete one of them.
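The intuition behind the mapping quality comparison in the answer above can be sketched with a few numbers. This is an illustration only: the MAPQ values are hypothetical, `mean_mapq_gap` is a made-up helper, and the statistic bcftools really applies is not described in this thread and is probably more involved than a plain difference of means.

```python
from statistics import mean


def mean_mapq_gap(ref_mapqs, alt_mapqs):
    """Mean MAPQ of reference-supporting reads minus mean MAPQ of variant-supporting reads."""
    return mean(ref_mapqs) - mean(alt_mapqs)


# Hypothetical MAPQ values, chosen only to illustrate the effect.
ref_reads = [60, 60, 60, 60]    # reads supporting the reference allele
alt_before = [60, 60, 60, 60]   # variant-supporting reads without any -C adjustment
alt_after = [30, 30, 30, 30]    # the same reads after being downgraded for excessive mismatches

gap_before = mean_mapq_gap(ref_reads, alt_before)
gap_after = mean_mapq_gap(ref_reads, alt_after)

if __name__ == "__main__":
    print("gap without downgrade:", gap_before)
    print("gap with downgrade:", gap_after)
```

Without the downgrade both groups look equally trustworthy. After the downgrade, the variant-supporting reads stand out with a clearly lower average, which is the kind of difference the answer refers to.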
Thanks fin swimmer!
Do you know if there is a way to create a BAM file containing the reads for which a variant was skipped because of a large difference in mapping quality? I would like to understand this better and would appreciate it a lot if you could recommend some literature.
PS: Sorry about the duplication.