Hi everyone, I am working on a genome assembly of a diploid (hybrid) plant, and I used PECAT for read correction, assembly, and phasing. I now have two sets of assemblies: the first is primary/alternate, and the second is haplotype 1/haplotype 2 (dual format). My question is: what is the best way to represent the genome of this type of plant? My reads are CLR, and I don't know whether the best option for this type of data is to present only the primary assembly or some other version (a collapsed genome). I understand that resolving a genome into haplotypes would require HiFi reads. Can anyone help me, please?
I was also considering running Purge Haplotigs on my primary assembly, or on a new version of the genome with collapsed haplotypes.
I would really appreciate any comments you can give me.
My assemblies look like this:
Primary: 360 Mb total size, N50 = 11 Mb
Alternate: 310 Mb total size, N50 = 8 Mb
Haplotype 1: 370 Mb total size, N50 = 11 Mb
Haplotype 2: 320 Mb total size, N50 = 8 Mb
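In case it helps anyone compare numbers, here is a minimal sketch of how total size and N50 can be computed from a FASTA file. This is not necessarily how the figures above were produced; the function names `fasta_lengths` and `n50` and the file name in the usage note are just illustrative, and it assumes a plain, uncompressed FASTA.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record from an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header closes the previous record (if any) and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50: the contig length at which the cumulative sum of
    lengths, sorted from longest to shortest, reaches half the total size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input
```

Usage would look something like `with open("primary.fasta") as fh: lengths = fasta_lengths(fh)`, then `sum(lengths)` for the total size and `n50(lengths)` for the N50 (the file name is a placeholder).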
Thanks so much.