I am new to bioinformatics, and I am not a biologist. I have some basic questions about sequencing/alignment and variant calling. For example, in humans, I understand that DNA mutates all the time, so when sequencing one specific locus we would in theory expect at most two variants (diploid), but there can be more because of sequencing errors, indels, deletions, etc. (these are discarded during the variant calling process).
Can it happen that human DNA contains more than two alleles at a locus because some cells carry a mutation/variation and others do not? In a haploid case, can we expect more than one allele?
When we obtain the 'consensus' from an alignment of all the reads from a single individual, does it take the most common variant at each position? If so, then for a diploid organism a single consensus loses a lot of information (all the alleles, where the site is heterozygous).
Thanks!
Yes, but in most cases this comes from mosaic cells (a sub-population of cells), and the extra allele is supported by only a low number of reads.
In most cases the information is contained in a SNP database for the organism you are studying.
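To make the consensus and mosaic points concrete, here is a minimal Python sketch. It is not how any particular variant caller works. The read pileups are invented toy data, and the function names and the `min_fraction` threshold are hypothetical choices for illustration only. It shows that a majority-vote consensus keeps one base per position, while looking at allele fractions keeps both alleles at a heterozygous site and shows a mosaic allele as a low-fraction signal.

```python
from collections import Counter

def allele_fractions(bases):
    """Return {base: fraction of reads} for the bases observed at one position."""
    counts = Counter(bases)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}

def consensus_base(bases):
    """Majority vote: the single most frequent base at the position."""
    return Counter(bases).most_common(1)[0][0]

def called_alleles(bases, min_fraction=0.2):
    """Alleles whose read fraction reaches a (hypothetical) threshold."""
    fractions = allele_fractions(bases)
    return sorted(b for b, f in fractions.items() if f >= min_fraction)

# Toy pileups (invented data): the bases from all reads covering one locus.
heterozygous = "A" * 16 + "G" * 14   # two alleles at roughly 50/50
mosaic = "A" * 27 + "G" * 3          # G carried by a small sub-population of cells

print(consensus_base(heterozygous))              # one base only; the other allele is lost
print(called_alleles(heterozygous))              # both alleles pass the threshold
print(called_alleles(mosaic))                    # the low-fraction allele is filtered out
print(called_alleles(mosaic, min_fraction=0.05)) # a lower threshold keeps it
```

Note that at this depth a true mosaic allele and a recurrent sequencing error look the same in the counts, which is why low-fraction variants usually need deep coverage to be called with any confidence.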