I need to evaluate two large (1 Gb) chromosome-level assemblies of the same genome by finding large structural variations between them (duplications, inversions, deletions, etc.). I am trying to use minimap2 to get these statistics (similar to the somewhat classical nucmer + show-diff approach), but I couldn't find any parser for .paf files (only paftools.js from minimap2's creator, which does not produce the desired statistics). Converting .paf to .delta and running dnadiff works only imperfectly.
Do you know of any parser for .paf files that finds structural variations? Or a workaround for the problem of comparing two large assemblies? Many thanks!
Why not use sam/bam files?
Good idea, SAM is much more widely used. Can you suggest a particular way of doing the task with SAM? I am still not sure whether I should do a local alignment or a global whole-genome one. Much obliged!
I would suggest taking a look at this approach: https://github.com/lh3/CHM-eval/tree/master/dip-call
Thanks a lot, great utility. But I was looking for large SV discovery, and what you shared reports only small variants (SNPs and indels).
I have a script that can convert PAF to delta. I have not tested variant calling with the resulting delta file, though. https://github.com/gorliver/paf2delta
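For the PAF parsing asked about in the question, here is a minimal Python sketch, not a validated SV caller. It reads only the 12 mandatory PAF columns and flags candidate inversions (alignments on the minority strand for a target/query pair) and candidate deletions or insertions (unequal target and query gaps between consecutive colinear alignments). The names `parse_paf` and `sv_candidates` and the thresholds `min_aln`, `min_gap` and `min_mapq` are hypothetical choices for illustration; duplications and translocations are not handled.

```python
from collections import defaultdict, namedtuple

# The 12 mandatory PAF columns, in order.
PafRecord = namedtuple(
    "PafRecord",
    "qname qlen qstart qend strand tname tlen tstart tend nmatch alnlen mapq",
)


def parse_paf(lines):
    """Yield a PafRecord per non-empty line; optional tags (column 13+) are ignored."""
    for line in lines:
        if not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        yield PafRecord(
            f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
            f[5], int(f[6]), int(f[7]), int(f[8]),
            int(f[9]), int(f[10]), int(f[11]),
        )


def sv_candidates(records, min_aln=10_000, min_gap=1_000, min_mapq=20):
    """Return (type, target, start, end, size) tuples; thresholds are assumptions."""
    by_pair = defaultdict(list)
    for r in records:
        if r.alnlen >= min_aln and r.mapq >= min_mapq:
            by_pair[(r.tname, r.qname)].append(r)

    calls = []
    for (tname, _qname), recs in by_pair.items():
        recs.sort(key=lambda r: r.tstart)
        # The dominant orientation is the strand carrying more aligned bases.
        plus = sum(r.alnlen for r in recs if r.strand == "+")
        minus = sum(r.alnlen for r in recs if r.strand == "-")
        major = "+" if plus >= minus else "-"

        # Blocks on the other strand are inversion candidates.
        for r in recs:
            if r.strand != major:
                calls.append(("INV", tname, r.tstart, r.tend, r.tend - r.tstart))

        # Compare target and query gaps between neighbouring colinear blocks.
        colinear = [r for r in recs if r.strand == major]
        for a, b in zip(colinear, colinear[1:]):
            tgap = b.tstart - a.tend
            # On the minus strand the query coordinates run backwards.
            qgap = b.qstart - a.qend if major == "+" else a.qstart - b.qend
            if tgap - qgap >= min_gap:
                calls.append(("DEL", tname, a.tend, b.tstart, tgap - qgap))
            elif qgap - tgap >= min_gap:
                calls.append(("INS", tname, a.tend, a.tend, qgap - tgap))
    return calls
```

Assuming the alignment was written to a file such as aln.paf (for example with minimap2 -cx asm5 ref.fa qry.fa > aln.paf), the calls could be collected with sv_candidates(parse_paf(open("aln.paf"))). Coordinates are reported on the target assembly, and the thresholds would need tuning for a real 1 Gb genome.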