Suppose the following:
- There is a gene X on chromosome Y. The wild-type allele is 'A' and the recessive allele is 'a'
- Allele 'a' results from a substitution of base T with base A at position #12897
- I sequenced a human genome that carries both alleles, 'A' and 'a', for gene X
- In the reference genome, the allele present for gene X is 'A'
a- First, does the reference genome contain both strands of each chromosome? That is, would there be two strands of chromosome Y, with an allele for gene X on each strand, or is there just one strand?
b- When the reads are mapped to the reference genome, we would have (theoretically, and as an example) around 100 reads with base T at position #12897 and around 100 reads with base A at that position, and these represent the two alleles for gene X. Is that correct?
c- Regarding SNPs: we consider a base to be a SNP if it differs from the base at the same position in the reference genome. But what if the base in the sequenced genome is not really the SNP, and the base in the reference genome is the one that should be called the SNP? Is this possible? I hope this is clear.
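
To make (b) and (c) concrete, here is a minimal Python sketch of what I have in mind. Everything in it is an assumption for illustration: the read counts are made up, and `allele_fractions` and `non_reference_bases` are hypothetical helpers, not functions from any real aligner or variant caller (real tools also account for base quality and sequencing error).

```python
from collections import Counter

# Made-up data: the base observed at position #12897 in each mapped read.
# 100 reads support T (allele 'A') and 100 support A (allele 'a').
reads_at_site = ["T"] * 100 + ["A"] * 100


def allele_fractions(bases):
    """Return the fraction of reads supporting each base at one position."""
    counts = Counter(bases)
    depth = sum(counts.values())
    return {base: n / depth for base, n in counts.items()}


def non_reference_bases(bases, reference_base):
    """Return the observed bases that differ from the reference base."""
    return sorted(set(bases) - {reference_base})


# (b) A sample with both alleles shows roughly half the reads for each base.
print(allele_fractions(reads_at_site))          # {'T': 0.5, 'A': 0.5}

# (c) Which base gets reported depends on what the reference carries.
print(non_reference_bases(reads_at_site, "T"))  # ['A']  reference has allele 'A'
print(non_reference_bases(reads_at_site, "A"))  # ['T']  if the reference had allele 'a'
```

In this sketch, swapping the reference base from T to A flips which base is reported as the difference, even though the reads are identical. That is the situation I am asking about in (c).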