How do you distinguish between homo- or heterozygote alleles if you sequence directly for a gene from genomic DNA?
Hi, if you are talking about Sanger sequencing (and the like), you need to extract genomic DNA, amplify a portion of the gene and clone it. Then you should transform competent bacteria and screen some of the resulting colonies. If the sample is heterozygous for the gene, you will find two different alleles among your sequencing results; otherwise, you will consistently get the same sequence.
If you do not clone your PCR amplicon into a vector, a crude way to assess whether you have a heterozygous sample is to inspect the electropherogram that is normally attached to the sequencing results. With the right experimental design, you should be able to see double peaks at the positions where the alleles differ.
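To make the "double peak" idea concrete, here is a minimal sketch of how one position of an electropherogram could be called from its four per-base signal heights. The function name, the intensity values and the 0.3 ratio cutoff are all assumptions made for illustration; real base callers use more elaborate models, and a sensible cutoff depends on your instrument and chemistry.

```python
def call_position(heights, min_ratio=0.3):
    """Call one trace position from four per-base peak heights.

    heights:   dict mapping 'A', 'C', 'G', 'T' to signal intensity.
    min_ratio: hypothetical cutoff; a secondary peak reaching at least
               this fraction of the primary peak is read as a second allele.
    """
    ranked = sorted(heights.items(), key=lambda kv: kv[1], reverse=True)
    (base1, h1), (base2, h2) = ranked[0], ranked[1]
    if h1 <= 0:
        return "N"  # no usable signal at this position
    if h2 / h1 >= min_ratio:
        # Two comparable peaks: report both bases, alphabetically ordered
        return "".join(sorted(base1 + base2))
    return base1


# Toy intensities (made up): a clean single peak, then a double peak
print(call_position({"A": 980, "C": 12, "G": 25, "T": 8}))
print(call_position({"A": 510, "C": 10, "G": 480, "T": 15}))
```

The first call returns a single base and the second returns both bases, which is the pattern you would expect at a heterozygous position.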
Thanks for the answer! And sorry, I assumed that everyone sequences with a capillary-electrophoresis-based method anyway, whether it is Sanger or NGS, so that it would not make a difference. So, if I amplify the gene and the sample is homozygous, I get (in a simplified picture) one spectral peak per nucleotide, and if I see an overlay of two peaks (in a "stable" surrounding), I would assume that the corresponding nucleotides come from two alleles?
It all depends on your pipeline, I would say: typically you would use Sanger-related methods for low-throughput projects, with procedures like those exemplified above. In NGS-based workflows you would more likely get reads from all alleles present in your initial sample.
In your shoes, I would probably follow the first of the two methodologies above. To simplify, the key steps would be:
1 - Design PCR primers in such a way that you are reasonably sure to tell apart the alleles you are interested in
2 - Extract genomic DNA from your sample (if you have some controls with known genotype, I would add them to the experimental design)
3 - Amplify your gene via PCR
4 - Purify your PCR product
5 - Ligate it into a convenient plasmid vector
6 - Transform it into competent bacteria
7 - Plate your bacterial culture
8 - Pick a few (say, ten) colonies expected to harbor the recombinant vector
9 - Prepare mini preps
10 - Sanger-sequence them with a standard primer
11 - Remove vector sequences from the sequencing results
12 - Map them against your genomic reference and visualize the genotype of your sample
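Once the clone sequences are trimmed, the last steps boil down to counting how many distinct insert sequences you recovered. Below is a rough sketch of that tally, plus a back-of-the-envelope estimate of how likely it is to miss one allele of a heterozygote when picking n colonies. The function names and the toy sequences are made up, and the estimate assumes both alleles are cloned with equal efficiency, which may not hold in practice.

```python
from collections import Counter


def summarize_clones(clone_seqs):
    """Count distinct trimmed insert sequences among the picked colonies."""
    counts = Counter(seq.upper() for seq in clone_seqs)
    if len(counts) > 1:
        genotype = "heterozygous"
    else:
        genotype = "homozygous (or second allele not sampled)"
    return counts, genotype


def prob_missing_allele(n_colonies):
    """Chance that all n colonies of a heterozygote carry the same allele.

    Assumes each colony carries either allele with probability 1/2.
    """
    return 2 * 0.5 ** n_colonies


# Toy data (made up): ten colonies, six with one variant and four with the other
clones = ["AAATAA"] * 6 + ["AAGCAA"] * 4
counts, genotype = summarize_clones(clones)
print(counts, genotype)
print(prob_missing_allele(10))
```

Under that equal-representation assumption, ten colonies leave roughly a 0.2% chance of seeing only one allele of a true heterozygote. Keep in mind that a variant seen in a single colony can also come from a polymerase or sequencing error, so singletons deserve a second look.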
Simply looking at the electropherogram is often doable, but the protocol above gives better resolution. And yes, double peaks in the electropherogram would hint at the co-presence of multiple amplicon variants (which in your case would likely mean multiple alleles).
1 - Design PCR primers in such a way that you are reasonably sure to tell apart the alleles you are interested in
I think this is the mental step I was struggling with from the beginning... How is this possible if I have two copies of the gene (alleles) on two chromosomes and I use the reference genome for the primer design?
Of course, the approach takes for granted that you have some pre-existing knowledge about the alleles you are talking about. For instance, say that you have a gene with two known alleles, differing by a few nucleotides at a given position. You would design primers on constant regions flanking this variable position, so that you have virtually no bias favoring one of the two alleles in the PCR step, and so that you get different amplicons allowing you to distinguish the two alleles later on.
To design primers you should:
- gather some knowledge about the alleles you are interested in (either via literature, public databases, dedicated wet-lab research etc.)
- maybe make a local map, using an allele of your choice as a reference (does this answer your question? It could be a good idea to use the one you would find in the reference genome)
- annotate, on that map, the positions differing among alleles
- design primers on constant regions, spanning such positions
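The "annotate differing positions, then pick constant flanks" part of the list above could be sketched like this. It assumes the known alleles are already aligned to the same length without gaps; the function names and the toy sequences are invented for illustration, and actual primer design would still need the usual checks (melting temperature, specificity and so on), which are not covered here.

```python
def variable_positions(alleles):
    """Return 0-based positions where pre-aligned alleles differ."""
    length = len(alleles[0])
    if any(len(a) != length for a in alleles):
        raise ValueError("alleles must be aligned to the same length")
    return [
        i for i in range(length)
        if len({a[i].upper() for a in alleles}) > 1
    ]


def constant_flanks(alleles):
    """Return (start, end) index ranges of the constant regions on either
    side of the variable stretch; primers would be placed inside them."""
    var = variable_positions(alleles)
    if not var:
        return None  # alleles are identical, nothing to span
    return (0, var[0]), (var[-1] + 1, len(alleles[0]))


# Toy alleles (made up): identical flanks around a two-base difference
left, right = "GGCTAACCGTAA", "AAGCTTGACCTGA"
allele_1 = left + "AT" + right
allele_2 = left + "GC" + right

print(variable_positions([allele_1, allele_2]))
print(constant_flanks([allele_1, allele_2]))
```

A forward primer chosen inside the first range and a reverse primer inside the second would amplify both alleles equally while keeping the variable positions within the amplicon.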
How should I know the few-nucleotide differences between the two alleles if this is exactly what I want to find out? Meanwhile, my direct sequencing of the exome of interest gave me two peaks of exactly the same height for two nucleotides... this seems to be proof enough to show heterogeneity.
Hi myrrlicht, the explanation is already found above. If you have gone for the electropherogram-based method and observed a few double peaks on a single exon (not exome, which is a different thing!) then no other action is needed.
If you want to understand the exact sequence of each variant of the exon, you have to clone the single variants and sequence them separately. If the variants are known, at this point you could probably attempt to solve ambiguities in the electropherograms by using one allele as a reference. For instance, say that you have two known alleles, 1 (5'-aaATaa-3') and 2 (5'-aaGCaa-3'). Your electropherogram might read as 5'-aaRYaa-3'. Since there is no allele with sequence 5'-aaGTaa-3' or 5'-aaACaa-3', you know that the allele composition of your sample is exactly 1 and 2.
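The reasoning in that example can be written down as a small check: merge each pair of known alleles into the IUPAC-coded read you would expect from a mixed trace, and keep the pairs that match what you observed. This is only a sketch, assuming equal-length alleles that differ by substitutions and at most two bases per position; the function names are made up, while the two-base IUPAC codes themselves are standard.

```python
from itertools import combinations_with_replacement

# Standard IUPAC codes for two-base ambiguities (keys sorted alphabetically)
TWO_BASE_CODES = {"AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M"}


def merged_read(seq_a, seq_b):
    """IUPAC-coded read expected when two alleles are sequenced together."""
    out = []
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        out.append(a if a == b else TWO_BASE_CODES["".join(sorted(a + b))])
    return "".join(out)


def compatible_genotypes(observed, known_alleles):
    """Pairs of known allele names whose merged read equals the observed one."""
    pairs = combinations_with_replacement(sorted(known_alleles.items()), 2)
    return [
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in pairs
        if merged_read(seq_a, seq_b) == observed.upper()
    ]


known = {"1": "aaATaa", "2": "aaGCaa"}
print(merged_read(known["1"], known["2"]))    # AARYAA
print(compatible_genotypes("aaRYaa", known))  # [('1', '2')]
```

With only alleles 1 and 2 known, the heterozygous pair is the single genotype compatible with the ambiguous read, which matches the conclusion above.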
Probably important to mention the technology you are using.