Hello,
I have trouble grasping certain concepts while analysing a VCF file. I need to find all transitions. I have a record with REF A, ALT G and genotype 1/1, so I would say that the person has the GG genotype and that 2 transitions happened (2x A->G). Correct me if I'm wrong.
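To make my counting concrete, here is a minimal sketch of how I am thinking about it. The function names are my own (hypothetical, not from any VCF library), it only handles single-base REF/ALT alleles, and counting once per allele copy is my assumption; counting once per variant site would give 1 instead of 2 for this record.

```python
# A transition stays within one base class: purine <-> purine (A, G)
# or pyrimidine <-> pyrimidine (C, T). Everything else is a transversion.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """Return True if the single-base change REF -> ALT is a transition."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != 1 or len(alt) != 1 or ref == alt:
        return False  # indels and non-changes are not handled in this sketch
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def count_transition_alleles(ref, alts, gt):
    """Count allele copies in a genotype that are transitions relative to REF.

    ref  -- REF allele, e.g. "A"
    alts -- list of ALT alleles, e.g. ["G"]
    gt   -- GT string, e.g. "1/1" or "0|1" (0 = REF, 1 = first ALT, ...)
    """
    count = 0
    for idx in gt.replace("|", "/").split("/"):
        if idx in (".", "0"):
            continue  # missing call or reference allele
        if is_transition(ref, alts[int(idx) - 1]):
            count += 1
    return count


print(count_transition_alleles("A", ["G"], "1/1"))  # 2
print(count_transition_alleles("A", ["G"], "0/1"))  # 1
print(count_transition_alleles("A", ["C"], "1/1"))  # 0
```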
However, when I think about this more, I realise that there are certain things (a lot of things :( ) I don't understand:
Does this mean that the reference genome has A on the fw strand and T on the other, while the sequenced sample has G on one strand and C on the other (for both chromosomes)? How can I know what the reference is for the other chromosome in the pair?
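To spell out the two-strand picture I have in mind, here is a small sketch. The `complement` helper is my own illustration (not from any library), and it only covers the four plain bases.

```python
# Watson-Crick base pairing: each base on one strand pairs with its
# complement on the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(base):
    """Return the base that pairs with `base` on the opposite strand."""
    return COMPLEMENT[base.upper()]


ref, alt = "A", "G"
print(f"fw strand:    {ref}>{alt}")
print(f"other strand: {complement(ref)}>{complement(alt)}")
```

So the same change would read as A>G on one strand and T>C on the other, if I understand the pairing correctly.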
How can the reference genome contain information about only one chromosome? What about alleles? If we know that a gene is "defined" by a combination of alleles, what does this information in the reference genome actually tell us?
Let's be very visual - I isolate DNA from a certain person and I am interested in a certain gene, let's call it gene X. That gene comes in two copies: one on the chromosome from mom and one on the chromosome from dad, each having 2 strands (4 strands overall). What do we sequence here during paired-end sequencing? Everything, fw and rev strands from both chromosomes?
What exactly are the reads that I am aligning? Do I just take all reads from the fw strands and align them to the reference genome? Or do I align both fw and rev?
tea.vuki
Why did you delete the post?