How are haplotypes/heterozygosity resolved in sequence assembly?
4.9 years ago
DNAlias ▴ 40

I am under the impression that many sequence assemblers are unable to resolve heterozygosity, and handle it either by separating the variants into different contigs or by fusing them into a hybrid of the two variants.

1) Which of these outcomes is preferable and why?

2) I know that there are variant calling pipelines that require a reference genome. Is there a way to recognize alleles during de novo assembly?

assembly • 925 views
4.9 years ago
Vitis ★ 2.6k

I think the ultimate goal when assembling a heterozygous genome is to fully resolve the two haplotypes, essentially producing two genome assemblies. Platanus seems to do a fairly good job of dealing with heterozygous genomes. Also, long-read sequencing technologies like Nanopore and PacBio enable variant phasing and resolution of alleles over long distances. Sometimes the genetic trick of "trio binning" also helps. Basically, you sequence the two parents plus the F1 offspring, so you can use the parental variant information to partition the offspring reads into haplotypes and then do two assemblies, one per haplotype.
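
To make the trio binning idea a bit more concrete, here is a toy sketch in Python. It assumes the "parental variant information" is represented as k-mers that occur in only one parent's reads, and assigns each offspring read to whichever parent shares more of those parent-specific k-mers with it. The function names (`kmers`, `parent_specific_kmers`, `bin_read`) are made up for illustration and are not taken from any real tool; the sketch also ignores reverse complements, sequencing errors, and k-mer count thresholds, all of which a real pipeline would have to handle.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets.

    Assumption: a k-mer seen in one parent's reads but not the other's
    marks that parent's haplotype. Real data would need error filtering.
    """
    mat = set()
    for read in maternal_reads:
        mat |= kmers(read, k)
    pat = set()
    for read in paternal_reads:
        pat |= kmers(read, k)
    return mat - pat, pat - mat


def bin_read(read, mat_only, pat_only, k):
    """Assign one offspring read to a haplotype bin by k-mer votes."""
    read_kmers = kmers(read, k)
    mat_hits = len(read_kmers & mat_only)
    pat_hits = len(read_kmers & pat_only)
    if mat_hits > pat_hits:
        return "maternal"
    if pat_hits > mat_hits:
        return "paternal"
    # No informative k-mers (homozygous region) or a tie.
    return "unassigned"


# Toy usage: the parents differ only at the last two bases.
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTACGTCC"], k=4)
bins = {read: bin_read(read, mat_only, pat_only, k=4)
        for read in ["TACGTAA", "TACGTCC", "ACGTACG"]}
```

Each bin of reads would then be assembled separately. Reads that land in "unassigned" come from regions where the parents do not differ, so in practice they would typically be included in both assemblies.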

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4120091/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6476705/
