Hi, I am doing de novo genome assembly with Canu. I got two contigs, one longer and one shorter. It seems that the longer one is the genome and the shorter one is a plasmid. When I checked the assembly by aligning the longer contig to the reference genome, I saw that a large part of the genome aligns at a different position. This is probably because the bacterial genome is circular. I want to identify structural variants relative to the reference genome, and I am afraid the misalignment will affect their accurate estimation. Does anyone have suggestions on how to deal with a circular genome when estimating structural variants? Thanks
The aligned genome looks like the one in the following link.
Are you sure this isn't simply that the order of your 2 contigs is different compared to the reference? You can just reorder them and it will be almost a perfect match - or am I misunderstanding?
Also, are you sure this is a chromosome and a plasmid? The reference sequence appears to be a single contiguous sequence, and that would be an enormous plasmid. Or is the plasmid you refer to the tiny turquoise block on the right?
OP has said in the other post I linked above that there is only one contig. So this may just be a matter of identifying the correct origin of replication. Another possibility is that the published reference is incorrect, but that may be a long shot.
Yes, I have only one contig to align with the reference genome. Since it is a circular bacterial genome, it may be a matter of identifying the origin of replication. Do you have any suggestions on how I could proceed with a circular genome? My ultimate goal is to find the structural variants in the genome, and for this I need to align the assembled genome to the reference genome properly.
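One way to approach this, sketched here under some assumptions: because the contig is circular, its start coordinate is arbitrary, so you can rotate it to begin at the same position as the reference before aligning. The snippet below is a minimal illustration, not a tested pipeline. The function names and the choice of anchor (for example the first few hundred bases of the reference, or the start of dnaA) are hypothetical, and it assumes the anchor occurs exactly, without mismatches, in the contig. It also assumes any overlapping ends the assembler left on the circular contig have already been trimmed. Dedicated tools such as Circlator's `fixstart` handle this more robustly.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def rotate_to_anchor(contig: str, anchor: str) -> str:
    """Rotate a circular contig so that it starts with `anchor`.

    Both strands are tried, because the assembler may have reported
    the contig in the opposite orientation to the reference.
    The anchor must match exactly (an assumption of this sketch).
    """
    if not anchor:
        raise ValueError("anchor must not be empty")
    for candidate in (contig, reverse_complement(contig)):
        # Append the first len(anchor) - 1 bases so that an anchor
        # spanning the artificial break point of the circle is found.
        extended = candidate + candidate[: len(anchor) - 1]
        pos = extended.find(anchor)
        if pos != -1:
            # Move everything before the anchor to the end.
            return candidate[pos:] + candidate[:pos]
    raise ValueError("anchor not found on either strand")


# Toy example: the anchor starts at index 4 of the contig.
rotated = rotate_to_anchor("GGTTACGTAA", "ACGT")
print(rotated)
```

After rotating, re-running the whole-genome alignment should show whether the large block that aligned "in a different place" was only an effect of the start coordinate, or whether a real rearrangement remains.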
There is a prior post by the original poster about this here: