Hello,
I'm trying to understand the differences between genome assembly and genome sequence alignment, and the benefits of each, when the goal is to identify structural variants (SVs) or transposons in populations.
I've been scouring the internet but have only really come across comparisons of short vs. long reads and of de novo vs. reference-based assembly.
My understanding is that there seem to be two main comparative genomic methods for identifying structural variation within a population. The first is what the 1KGP and SGDP did: sequence the whole genome, align the reads to the reference genome, and end up with a BAM file.
The second is to assemble personal genomes and then compare or align the assemblies to each other and to the reference genome, for example using LASTZ/liftOver/chains and nets. Example: 10.1016/j.gene.2005.09.031
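To make the first approach concrete, here is a minimal sketch of how I picture an SV signal being read off an alignment: walk the CIGAR string of one aligned read and report any insertion or deletion above a size cutoff. The function name `large_indels` and the 50 bp cutoff are my own assumptions for illustration, not taken from any of these projects' pipelines. As far as I understand, long gaps like these mostly show up in long-read or assembly-to-reference alignments, while short-read callers rely more on split reads and discordant pairs.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification)
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def large_indels(pos, cigar, min_len=50):
    """Return (type, ref_pos, length) for each I/D operation of at least min_len bp.

    pos is the 1-based leftmost mapping position of the read (the SAM POS field).
    For a deletion, ref_pos is the first deleted reference base.
    For an insertion, ref_pos is the reference base right after the inserted sequence.
    """
    calls = []
    ref_pos = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "ID" and length >= min_len:
            calls.append(("INS" if op == "I" else "DEL", ref_pos, length))
        if op in REF_CONSUMING:
            ref_pos += length  # insertions and clips do not advance the reference
    return calls

# Hypothetical read: 120 bp deletion and 60 bp insertion relative to the reference
print(large_indels(1000, "500M120D300M60I200M"))
# -> [('DEL', 1500, 120), ('INS', 1920, 60)]
```

Note that the inserted bases are only visible here because the read happens to span them; sequence that is absent from the reference and not anchored by flanking matches would not appear in this kind of output at all.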
Thanks in advance.
Thank you for your response,
Can you expand on why a reference creates an a priori bias when trying to identify novel SVs? I'm finding it difficult to grasp why a genome assembly is needed when more influential projects like the 1KGP, the Human Genome Diversity Project and the Simons Genome Diversity Project aren't generating assemblies and are just mapping reads to the reference.