Which peak is homozygous and which is heterozygous in a k-mer plot for genome size estimation?
9.8 years ago
Prakki Rama ★ 2.7k

Hi all,

How do we know which peak is homozygous and which is heterozygous when we generate a k-mer plot for estimating genome size? I would be thankful for any pointers.

genome kmer Assembly • 6.2k views
9.8 years ago
thackl ★ 3.0k

Assuming a diploid organism (and two peaks), the heterozygous peak is the first peak, ideally at 1/2 the coverage of the second, hopefully larger, homozygous peak. This is simply because k-mers from a homozygous site are present in both alleles, while k-mers covering a heterozygous site are present in only one allele, and hence produce a signal at half the expected genome coverage.
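The half-coverage rule above can be checked on a k-mer histogram with a few lines of Python. This is only a rough sketch, not anything from a specific tool: the function names, the `min_cov` cutoff for skipping low-coverage error k-mers, and the `tolerance` on the 1/2 ratio are all assumptions made for illustration, and the histogram is a synthetic toy. Real histograms are noisy and would need smoothing before a simple local-maximum search like this is reliable.

```python
def find_peaks(hist, min_cov=5):
    """Return coverages that are local maxima of a k-mer histogram.

    hist: dict mapping coverage -> number of distinct k-mers seen at that coverage.
    min_cov: hypothetical cutoff to skip the low-coverage error k-mers.
    """
    peaks = []
    for cov in sorted(hist):
        if cov < min_cov:
            continue
        left = hist.get(cov - 1, 0)
        right = hist.get(cov + 1, 0)
        if hist[cov] > left and hist[cov] >= right:
            peaks.append(cov)
    return peaks


def label_peaks(peaks, tolerance=0.15):
    """Label the first two peaks if the first sits near half the second.

    tolerance is an assumed allowance on the 1/2 coverage ratio.
    Returns an empty dict if the two-peak diploid pattern is not found.
    """
    if len(peaks) < 2:
        return {}
    first, second = peaks[0], peaks[1]
    if abs(first / second - 0.5) <= tolerance:
        return {"heterozygous": first, "homozygous": second}
    return {}


# Synthetic toy histogram: error k-mers at low coverage, a small peak
# at coverage 20 and a larger one at coverage 40.
toy_hist = {}
for c in range(1, 61):
    errors = 5000 // c**3
    het = max(0, 100 - 10 * abs(c - 20))
    hom = max(0, 300 - 30 * abs(c - 40))
    toy_hist[c] = errors + het + hom

print(label_peaks(find_peaks(toy_hist)))
```

If only one peak is found, the sketch returns nothing rather than guessing, since a single peak could be either a mostly homozygous genome or a highly heterozygous one.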


Thank you. But what about the other small peaks that appear in the plot after the homozygous peak? They must be repetitive regions with higher coverage, am I right?


Yes, additional peaks after the C2 peak (the diploid genome peak) represent regions with higher copy number, such as repeats. However, to form a peak you need either a large region or many sequences with very similar copy numbers.

Repeats usually don't form a peak, as each repeat is small and different repeats have different copy numbers.

But, for example, I've got a plot from a small genome with high gene content that shows a small, distinct peak at C4. This peak comprises duplicated gene families. Mitochondrial and chloroplast genomes also produce their own peaks at their respective coverage (often 100 to 10,000 times the genome coverage). Partial genome duplications or chromosome aberrations can produce additional distinct peaks as well, and so can bacterial contamination, symbionts and parasites.

You can estimate the "size" of a peak to get an idea of what it represents. Simply sum up count × coverage over the k-mers in the peak region, that is, the number of distinct k-mers at each coverage multiplied by that coverage.
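As a rough sketch of that sum in Python: the function names and the `lo`/`hi` bounds of the peak region are made up for illustration, and the histogram values are toy numbers. The second function goes one step beyond the post: dividing the summed k-mer occurrences by the coverage of the homozygous peak is the usual way to turn such a sum into an approximate length in bases, but it is my addition, not part of the answer above.

```python
def peak_kmer_total(hist, lo, hi):
    """Sum count * coverage over the coverages lo..hi (inclusive).

    hist: dict mapping coverage -> number of distinct k-mers at that coverage.
    lo, hi: hypothetical bounds of the peak region, chosen by eye from the plot.
    """
    return sum(cov * n for cov, n in hist.items() if lo <= cov <= hi)


def peak_size_estimate(hist, lo, hi, genome_cov):
    """Approximate sequence length behind a peak (assumption, see lead-in).

    genome_cov: coverage of the homozygous (C2) peak.
    """
    return peak_kmer_total(hist, lo, hi) / genome_cov


# Toy numbers only: a narrow peak around coverage 40.
toy_hist = {38: 10, 40: 20, 42: 10, 80: 3}
print(peak_kmer_total(toy_hist, 35, 45))
print(peak_size_estimate(toy_hist, 35, 45, genome_cov=40))
```

Comparing this number across peaks gives a feel for whether a peak is a few kilobases (for example an organelle) or a sizeable fraction of the genome.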
