Contigs to Chromosomal Level Assembly
22 months ago

I have long-read data that I assembled with Canu, and it looks like a good assembly. There are 36 contigs with a total assembly size of 45 Mb (the reference is around 42 Mb). N50 = 243,820 bp, number of misassemblies = 2,099, and misassembled contigs length = 44,165,538 bp.

Now my question is how do I proceed to make it a chromosomal-level assembly?

ngs assembly • 1.6k views
22 months ago
shelkmike ★ 1.4k

1) Actually, an N50 of 243,820 bp does not indicate a good assembly. Good assemblies nowadays have a contig N50 on the order of megabases or tens of megabases. You may want to try some other assemblers, for example Flye.
2) How did you measure the number of misassemblies? You can do this only if this genome has already been assembled and you use that assembly as a reference.
3) The best way to make chromosome-level scaffolds is to generate Hi-C reads and then use them for scaffolding with a tool like Pin_hic (https://github.com/dfguan/pin_hic) or YaHS (https://github.com/c-zhou/yahs).


Re: (1) - doesn't this depend on the species, as in, the size of the genome of interest? For instance, I can imagine a good N50 size is different between humans and yeast.


1) I will try other tools as well.
2) The number of misassemblies was calculated using QUAST, with the already published reference genome as the reference.
3) If I don't have Hi-C reads, is there any other way of doing it?


If that reference belongs to another specimen of your species, you cannot distinguish misassemblies from actual differences between the genomes.

Using only the long reads that you have, without additional methods (like Hi-C, optical mapping or better long reads), you won't be able to make a chromosome-level assembly.


Yes, it does. But N50 of 243,820 bp is still low even for a yeast genome, unless the assembly was made using short reads. The author used long reads.

22 months ago
shelkmike ★ 1.4k

If the assembly length is 45 Mbp and there are 36 contigs, the average length of contigs is 1.25 Mbp. Are you sure that N50 is only 243,820 bp? This looks contradictory.
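
The contradiction can be checked with a little arithmetic. The sketch below assumes the usual definition of N50 (sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly) and uses the rounded figures from the question (45 Mbp, 36 contigs). The function names `n50` and `n50_lower_bound` are illustrative and do not come from any tool mentioned in this thread. Under that definition, the N50 contig and all shorter contigs together cover more than half of the assembly, and there are at most as many of them as there are contigs, so N50 must exceed total length / (2 × number of contigs).

```python
def n50(lengths):
    """Return the N50 (in bp) of a list of contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def n50_lower_bound(total_length, n_contigs):
    """Value that N50 must exceed for a given total length and contig count."""
    return total_length / (2 * n_contigs)


if __name__ == "__main__":
    total_length = 45_000_000  # rounded assembly size from the question
    n_contigs = 36             # contig count from the question

    print("mean contig length:", total_length / n_contigs)
    print("N50 must exceed:", n50_lower_bound(total_length, n_contigs))
    print("toy N50 of [10, 20, 30, 40]:", n50([10, 20, 30, 40]))
```

With these inputs the bound is 625,000 bp, so a reported N50 of 243,820 bp cannot be correct for an assembly of 36 contigs totalling about 45 Mbp. At least one of the three numbers is probably misreported, or the statistics come from different files.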
