de novo genome assembly of bacterial genome
4.0 years ago
rthapa ▴ 90

Hi, I am doing de novo genome assembly with Canu. I got two contigs, one longer and one shorter. It seems that the longer one is the genome and the shorter one is a plasmid. When I checked the assembly by aligning the longer contig to the reference genome, I saw that a big part of the genome is aligned in a different place. This is probably due to the circular genome of the bacterium. I want to find structural variants relative to the reference genome, and I am afraid the misalignment will affect their accurate estimation. Does anyone have suggestions on how to deal with a circular genome when estimating structural variants? Thanks

The aligned genome looks like the one in the following link.

Mauve alignment

assembly genome bacteria • 1.4k views

Are you sure this isn't simply that the order of your 2 contigs is different compared to the reference? You can just reorder them and it will be almost a perfect match - or am I misunderstanding?

Also, are you sure this is a chromosome and a plasmid? The reference sequence appears to be a single contiguous sequence, and that would be an enormous plasmid. Or is the plasmid you refer to the tiny turquoise block on the right?


The OP has said in the other post I linked that there is only one contig. So this may just be a matter of identifying the correct origin of replication. Another possibility is that the published reference is incorrect, but that may be a long shot.


Yes, I have only one contig to align with the reference genome. Since it is a bacterial genome and circular, it may be a matter of identifying the origin of replication. Do you have any suggestions on how I could proceed with a circular genome? My ultimate goal is to find the structural variants in the genome, so I need to align the assembled genome with the reference genome properly.


There is a prior post by the original poster about this here:

4.0 years ago
h.mon 35k

There is no evidence of structural variation in your assembly.

The Mauve image above actually contains two contigs for both the reference genome and your assembled genome, probably the chromosome and a plasmid (this genomic structure seems to be common in this species, e.g., https://apsjournals.apsnet.org/doi/10.1094/PDIS-06-20-1329-RE ).

If you look at the main bacterial chromosome, you can see that your assembly and the reference are completely colinear; the only difference is that the arbitrary break in the circular contigs was made at different locations in the two assemblies. Canu breaks the circular assembly at a random location, and it may also be slightly imprecise with the sequence at these breaks. You should consider running Circlator to fix these issues.
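To illustrate why a different break point is not a structural variant, here is a minimal Python sketch that rotates a circular contig so that it starts at the same sequence as the reference. This is not how Circlator works internally; `rotate_to_anchor` and `reverse_complement` are hypothetical helpers written for this example, and the sketch assumes the anchor (for instance, the first few hundred bases of the reference) occurs exactly once in the contig with no mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def rotate_to_anchor(contig: str, anchor: str) -> str:
    """Rotate a circular contig so that it starts with `anchor`.

    Hypothetical helper: the contig is treated as circular, so the anchor
    may span the arbitrary break. If the anchor is not found on the given
    strand, the reverse complement of the contig is tried as well.
    """
    if len(anchor) > len(contig):
        raise ValueError("anchor is longer than the contig")
    for candidate in (contig, reverse_complement(contig)):
        # Append the first len(anchor) - 1 bases so that a match crossing
        # the break point is still visible to a plain substring search.
        wrapped = candidate + candidate[: max(len(anchor) - 1, 0)]
        pos = wrapped.find(anchor)
        if pos != -1:
            return candidate[pos:] + candidate[:pos]
    raise ValueError("anchor not found on either strand")


# Toy data: the "assembly" is the same circle as the "reference",
# only broken at a different position.
reference = "ATGCCGTAAGCTTGAC"
assembly = reference[10:] + reference[:10]
rotated = rotate_to_anchor(assembly, reference[:6])
print(rotated == reference)
```

With real data you would read the contig from FASTA, pick an anchor from the start of the reference chromosome, write the rotated contig back out, and then redo the whole-genome alignment before calling structural variants. An exact-match anchor is fragile if the two genomes differ at that locus, which is one reason a dedicated tool is the safer choice.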
