Hi,
I got sequencing data from a haploid genome and created consensus sequences using bcftools as described at http://samtools.github.io/bcftools/howtos/consensus-sequence.html, i.e.,
variant calling -> InDel normalization and filtering -> consensus calling.
I thought bcftools consensus would build the sequence by some kind of majority vote, but it does not seem to do so. Instead, I get a mutation wherever alternative bases were found, regardless of the fraction of reads supporting them.
1) Does anybody know whether consensus has a parameter for setting a threshold (e.g., at least 50% of reads must carry the alternative base), so that only mutations found frequently among the reads are applied?
2) If there is no such parameter, do you think it would be legitimate to exclude mutations found in less than 50% of the reads after variant calling, using bcftools view -e '(DP4[2]+DP4[3])/sum(DP4)<0.5'?
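
To make question 2 concrete, here is a minimal Python sketch of the fraction I mean to filter on. It assumes DP4 holds the counts of high-quality ref-forward, ref-reverse, alt-forward and alt-reverse reads, in that order. The function names and the example counts are made up for illustration and are not part of bcftools. The grouping of the numerator matters: without parentheses around the two alt counts, only the last count is divided by the total.

```python
def alt_fraction(dp4):
    """Fraction of high-quality reads supporting the alternative allele.

    dp4 is assumed to be (ref_fwd, ref_rev, alt_fwd, alt_rev).
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = dp4
    total = ref_fwd + ref_rev + alt_fwd + alt_rev
    if total == 0:
        # No high-quality reads at all: treat the site as unsupported.
        return 0.0
    return (alt_fwd + alt_rev) / total


def ungrouped_value(dp4):
    """What alt_fwd + alt_rev / total evaluates to when the numerator
    is not grouped (division binds tighter than addition)."""
    return dp4[2] + dp4[3] / sum(dp4)


def keep_variant(dp4, min_fraction=0.5):
    """Keep a variant only if at least min_fraction of reads support it."""
    return alt_fraction(dp4) >= min_fraction


if __name__ == "__main__":
    example = (10, 10, 3, 2)          # hypothetical site: 5 of 25 reads are alt
    print(alt_fraction(example))      # intended fraction, 5 / 25
    print(ungrouped_value(example))   # 3 + 2 / 25, larger than 1, never < 0.5
    print(keep_variant(example))      # minority variant is dropped
```

With the ungrouped form, a site like the one above would never be excluded, because the value is at least the alt-forward count whenever that count is 1 or more.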