Hello,
I did an assembly with PacBio HiFi reads. The metrics are very good.
I decided to do SV calling with pbsv: I aligned the initial reads to this assembly with minimap2, and I detected many SVs (insertions, deletions, translocations...). It is a fairly small diploid fungal genome, easy to deal with.
How can I detect these kinds of SVs with the reads that were used to build the assembly? Shouldn't it be "linear", I mean with no variants? Shouldn't aligning these reads back give a perfect alignment, without any variation?
For example: if I detect a deletion, and the initial reads at this position are split 50/50 (50% of the reads with the deletion, 50% without, because of diploidy), was the assembly collapsed here to the version without the deletion? And so, is the deletion detected from the remaining 50% of the reads during SV calling?
Best
Given your genome is diploid, and genome assemblies tend to collapse all information into a single linear molecule to represent an assembly, finding SVs could mean a few things:
pbsv calls breakpoints, but if tandem repeats were collapsed in your assembly, then duplications may be called at that locus if coverage information is included in the SV call.
Thanks for your feedback:
I used hifiasm. Then, I ran svim-asm to call the variants: when I aligned the two phases to the haploid version, I noticed that one of the phases has an insertion at a specific locus and the other does not (confirmed with IGV). I also aligned the initial reads, and about 50% of them have this deletion. So, the deletion is collapsed during the assembly step.
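To make the 50/50 reasoning concrete, here is a minimal Python sketch that classifies an SV call from the number of reads supporting it versus the number matching the assembly. The function name `classify_sv_support` and the 0.25/0.75 cutoffs are assumptions chosen for illustration; they do not come from pbsv, svim-asm or any other tool, and should be tuned to your coverage.

```python
def classify_sv_support(alt_reads: int, ref_reads: int,
                        het_low: float = 0.25, het_high: float = 0.75) -> str:
    """Classify an SV call by the fraction of reads that support it.

    alt_reads: reads carrying the SV (they disagree with the assembly).
    ref_reads: reads matching the assembly at this locus.
    het_low / het_high: illustrative cutoffs (assumptions, not tool defaults).
    """
    total = alt_reads + ref_reads
    if total == 0:
        return "no_coverage"

    alt_fraction = alt_reads / total

    if alt_fraction < het_low:
        # Few reads carry the SV: weak evidence, possibly noise.
        return "low_support"
    if alt_fraction <= het_high:
        # Roughly half the reads carry the SV: the two haplotypes differ
        # and the collapsed assembly represents only one of them.
        return "heterozygous"
    # Almost all reads disagree with the assembly: more likely an
    # assembly error than a real difference between haplotypes.
    return "homozygous_alt"


# Hypothetical read counts at one locus, for illustration only.
print(classify_sv_support(15, 15))
```

With a collapsed diploid assembly, a call in the "heterozygous" range is what you would expect for a real difference between the two haplotypes, which matches the case described above.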