N50 is too short in de novo assembly
3.2 years ago
Takuma ▴ 20

Hello, I am new to bioinformatics!

I have Illumina short reads (2×150 bp) from a beetle for which no reference genome exists. Please see below.
After trimming with fastp there are about 320,000,000×2 reads in total, and KmerGenie estimates the genome size at about 530 Mbp. So I think the coverage is 150 bp × 320,000,000 × 2 / 530 Mbp = about 180×.
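
The coverage arithmetic above can be checked with a short sketch. The function name and arguments are hypothetical, chosen only for illustration. It assumes every read is still the full 150 bp, so the result is an upper bound, since trimming usually shortens some reads.

```python
def estimated_coverage(read_pairs, read_length_bp, genome_size_bp):
    """Return approximate sequencing depth for paired-end reads.

    Assumes both mates of every pair have the same length (an assumption;
    trimmed reads are often shorter, which lowers the true depth).
    """
    total_bases = read_pairs * 2 * read_length_bp
    return total_bases / genome_size_bp


# Values taken from the post: 320 M read pairs, 150 bp reads, 530 Mbp genome.
coverage = estimated_coverage(320_000_000, 150, 530_000_000)
print(f"Estimated coverage: {coverage:.0f}x")  # prints "Estimated coverage: 181x"
```

A more accurate figure would use the total base count that fastp reports after filtering instead of reads × 150 bp.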

Now I am running a de novo assembly with Platanus using some options (-u 0.2 -s 3 -d 0.3). But the N50 is too short: 992 bp for contigs and 5,329 bp for scaffolds.
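
For reference, here is a minimal sketch of how N50 is usually computed from a list of contig or scaffold lengths. It uses the common definition: the length of the sequence at which the running total, counted from the longest sequence down, first reaches half of the total assembly length. The function name and example lengths are made up for illustration, and assembly statistics tools may differ in details such as minimum-length filters.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and return the length at which
    the cumulative sum first reaches at least half of the total.
    """
    sorted_lengths = sorted(lengths, reverse=True)
    half_total = sum(sorted_lengths) / 2
    running = 0
    for length in sorted_lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Made-up example: total is 1500, half is 750; 500 + 400 = 900 >= 750.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # prints 400
```

Because N50 depends on which sequences are counted, comparing runs is only meaningful when the same minimum-length cutoff is applied to each assembly.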

I think this species is highly heterozygous. Should I increase the -u value, for example to -u 0.3? Do you have any ideas for improving the N50?

assembly N50 genome • 865 views
3.2 years ago

Different assemblers can perform radically better or worse depending on parameter settings, so try different parameters and tools.

At the same time, it is also possible that your data are biased, for example because the library was not fragmented quite right, which would also lead to small contigs/scaffolds.

Contamination with other genomes can also lead to loss of coverage in critical areas.

Map your reads to the genome of the closest relative and investigate the alignments.
