calculate the heterozygosity of an assembly
9 months ago
sansan96 ▴ 130

Hello everyone,

Do you know of any tool that allows you to calculate the heterozygosity of an assembly? The genome I have is from a diploid plant.

plant assembly Genome genome heterozygosity • 659 views
9 months ago
dthorbur ★ 2.5k

Do you have access to the raw sequencing data for the assembly? If so, you can map the raw reads back to the assembly (using it as the reference genome), call variants, and then estimate heterozygosity from the number of loci with heterozygous SNPs.

However, this value can be inflated by copy-number-variable loci, although you could probably disentangle some of these by inspecting coverage maps and identifying loci with more than 2 variants.
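
To make the counting step concrete, here is a minimal sketch in Python, assuming you already have a single-sample VCF from whichever variant caller you used. The function names and the `callable_bases` denominator are my own illustrative choices, not part of any tool; how you define callable bases (for example, positions with adequate read depth) is up to you.

```python
def count_het_snps(vcf_lines, sample_index=0):
    """Count biallelic heterozygous SNP genotypes for one sample in VCF text."""
    het = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 + sample_index:
            continue  # no genotype column for this sample
        ref, alt, fmt = fields[3], fields[4], fields[8]
        # Keep only biallelic SNPs: one-base REF and a single one-base ALT.
        if len(ref) != 1 or len(alt) != 1 or alt in {".", "*"}:
            continue
        keys = fmt.split(":")
        if "GT" not in keys:
            continue
        genotype = fields[9 + sample_index].split(":")[keys.index("GT")]
        alleles = genotype.replace("|", "/").split("/")
        # Heterozygous diploid call: two known alleles that differ.
        if len(alleles) == 2 and "." not in alleles and alleles[0] != alleles[1]:
            het += 1
    return het


def heterozygosity(het_sites, callable_bases):
    """Heterozygous sites per callable base (hypothetical denominator, see above)."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return het_sites / callable_bases
```

You would call it as `count_het_snps(open("calls.vcf"))` (the file name is a placeholder) and divide by your callable-base count. It does nothing about the copy-number issue, so filtering on depth beforehand is still advisable.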

9 months ago
Corentin ▴ 610

Hi,

It is difficult to give you an accurate answer without having more details about your assembly or the sequencing data you used to produce it.

The assembly is usually a haploid representation of the genome, so the concept of heterozygosity does not apply to it (unless it is a phased assembly).

If you want to estimate the level of heterozygosity of the genome itself, you can use a k-mer-based approach on the sequencing reads, for example jellyfish + GenomeScope (see http://qb.cshl.edu/genomescope/). This approach works best on reads with low error rates (usually short reads, or the recent high-quality long reads).
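
For intuition about what the k-mer approach works from, here is a toy Python sketch of a canonical k-mer count histogram, the kind of table a k-mer counter produces and that gets fitted afterwards. It is only an illustration: the function names are mine, it holds everything in memory, and it is no substitute for jellyfish on real data. In a heterozygous diploid genome this histogram typically shows a peak at roughly half the main coverage (k-mers present on only one haplotype) next to the main homozygous peak.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_histogram(reads, k=21):
    """Map each k-mer multiplicity to the number of distinct canonical k-mers with it."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing N or other ambiguous bases.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return Counter(counts.values())
```

The result maps multiplicity to number of distinct k-mers, which is the two-column shape of a k-mer histogram file.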

Finally, it is not exactly what you asked, but if you want to assess the level of duplications in your assembly (usually resulting from heterozygous regions), then you can use tools such as BUSCO: https://busco.ezlab.org/.
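
If you go the BUSCO route, the number to look at is the duplicated fraction (the `D:` field) in the one-line summary. A small Python helper to pull it out could look like the sketch below, assuming the summary uses the usual `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` notation; the function name is hypothetical and the example values are made up.

```python
import re


def busco_duplicated_percent(summary_line):
    """Extract the duplicated-BUSCO percentage (the D: field) from a summary line."""
    match = re.search(r"D:(\d+(?:\.\d+)?)%", summary_line)
    if match is None:
        raise ValueError("no D: field found in BUSCO summary line")
    return float(match.group(1))


# Made-up example line, not real results.
example = "C:95.0%[S:90.0%,D:5.0%],F:2.0%,M:3.0%,n:1614"
print(busco_duplicated_percent(example))
```

A high duplicated percentage in a diploid assembly often points to haplotypes assembled as separate copies rather than true gene duplication, so it is worth checking against coverage.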
