I am quite new to bioinformatics. I read a paper on Mycobacterium in which the authors state: "A consensus of more than 75% of reads was necessary to support high-confidence nucleotide variant calls made with SAMtools mpileup, which had to be homozygous in a diploid mode. Only variants supported by at least five reads, including one in each direction that did not occur at sites with unusual depth and were not within 12 base pairs of another nucleotide variant or indel, were accepted."
"Uncalled sites where variants had been identified in other samples were manually inspected and nucleotides initially excluded because of excessive (>97·5th percentile) high-quality read depth were reinserted in an additional filtering step."
I have checked the SAMtools documentation as well as the GitHub repository but can't seem to understand it. Could someone explain this with an example? Thanks.
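As a rough illustration of what the quoted criteria amount to, here is a minimal Python sketch that applies them to a handful of made-up candidate sites. It is not the authors' pipeline and it does not call SAMtools. The `Site` record, the function names, the nearest-rank percentile, and the example numbers are all hypothetical. It also assumes that "within 12 base pairs" means a distance of 12 or less, that proximity is checked against every candidate rather than only the accepted ones, and that "unusual depth" means depth above the 97.5th percentile of a supplied list of per-site depths.

```python
import math
from dataclasses import dataclass


@dataclass
class Site:
    pos: int        # 1-based reference position of the candidate variant
    genotype: str   # genotype from the diploid-mode caller, e.g. "1/1"
    fwd_alt: int    # forward-strand reads supporting the variant allele
    rev_alt: int    # reverse-strand reads supporting the variant allele
    depth: int      # total high-quality read depth at the site


def percentile(values, pct):
    """Nearest-rank percentile (an assumed definition; the quote gives none)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def passes_site_filters(site, max_depth, min_reads=5, min_frac=0.75):
    """Per-site checks: homozygous call, read support, both strands, depth."""
    alt = site.fwd_alt + site.rev_alt
    return (
        site.genotype == "1/1"                        # homozygous variant
        and alt >= min_reads                          # at least five reads
        and site.fwd_alt >= 1 and site.rev_alt >= 1   # one in each direction
        and site.depth > 0 and alt / site.depth > min_frac  # >75% consensus
        and site.depth <= max_depth                   # not excessive depth
    )


def filter_variants(sites, max_depth, min_gap=12):
    """Return positions that pass all checks and have no close neighbour."""
    kept = []
    for site in sites:
        has_neighbour = any(
            other.pos != site.pos and abs(other.pos - site.pos) <= min_gap
            for other in sites
        )
        if not has_neighbour and passes_site_filters(site, max_depth):
            kept.append(site.pos)
    return kept


# Hypothetical per-site depths standing in for the genome-wide distribution.
cutoff = percentile(list(range(1, 41)), 97.5)

sites = [
    Site(100, "1/1", 6, 4, 11),     # passes every check
    Site(500, "1/1", 5, 0, 6),      # no reverse-strand support
    Site(900, "0/1", 4, 4, 16),     # heterozygous, only 50% of reads
    Site(1300, "1/1", 3, 3, 7),     # within 12 bp of position 1308
    Site(1308, "1/1", 4, 4, 9),     # within 12 bp of position 1300
    Site(2000, "1/1", 40, 40, 90),  # depth above the cutoff
]

kept = filter_variants(sites, cutoff)
print(kept)  # [100]
```

In a real analysis the per-strand counts and depth would come from the variant caller's output rather than being typed in by hand. The second quoted sentence describes a later manual step in which sites dropped only by the depth check were reviewed and reinserted, which this sketch does not model.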
Welcome to Biostars. Please edit the title so that it reflects the question you are asking above.
Since you are referring to a bacterium, why should ploidy come into play?
Under some conditions, replication and cell division are not tightly coupled, leading to diploid and polyploid states. The OP could be referring to a study involving this.
"I have checked ... but can't seem to understand it." Some clarification on what you can't understand sure would be nice (as @rfran010 mentioned).
Are you asking why the authors used this strategy, or how samtools mpileup works?
Why did you delete the post, sd?
Following up - please add a reason why you're deleting this post repeatedly. If you delete this again without adding a reason, your account will be suspended.