What is the minimum number/threshold of gaps (NNNNNN) permitted in a draft genome?
5.4 years ago
Kumar ▴ 120

I sequenced a bacterial whole genome on the Illumina platform. The high-quality reads were assembled de novo with the CGE Bacterial Analysis Pipeline, giving a 4.9 Mb draft genome, which is close to the expected genome size for this bacterium. I have used the assembled draft genome for various comparative genomic analyses, such as genomic island prediction and pan-genome analysis. At some point I noticed that the assembly contains 1463 bp of N residues (0.001463 Mb out of 4.9 Mb). Is this negligible for further downstream analysis? If not, please let me know how I could fix it.

Thank you in advance.

Assembly genome next-gen-sequencing • 1.0k views

There is no defined threshold for the permitted number of gaps. To fill the remaining gaps you would probably need additional data, e.g. long reads from Oxford Nanopore or PacBio.

5.4 years ago
cschu181 ★ 2.8k

I'd say this is expected. Since you're only using short reads, your contigs/scaffolds cannot bridge certain low-complexity, repetitive regions, so the assembler introduces stretches of N where assembly paths cannot be uniquely resolved. I wouldn't worry about this.
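
If you want to see where the N stretches sit and how much of the assembly they cover, something like the following might help. It is only a minimal sketch in plain Python, not part of the CGE pipeline; the function names `read_fasta` and `gap_report` and the file name `assembly.fasta` are made up for illustration.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            # Sequence lines before the first header are ignored.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gap_report(lines: Iterable[str]) -> Dict[str, object]:
    """Summarise runs of N (upper or lower case) in an assembly.

    Gap coordinates are 0-based, half-open (start, end) per record.
    """
    total_bp = 0
    n_bp = 0
    gaps: List[Tuple[str, int, int]] = []
    for name, seq in read_fasta(lines):
        total_bp += len(seq)
        for match in re.finditer(r"[Nn]+", seq):
            n_bp += match.end() - match.start()
            gaps.append((name, match.start(), match.end()))
    fraction = n_bp / total_bp if total_bp else 0.0
    return {"total_bp": total_bp, "n_bp": n_bp,
            "n_fraction": fraction, "gaps": gaps}


if __name__ == "__main__":
    # "assembly.fasta" is a placeholder path; point it at your own file.
    with open("assembly.fasta") as handle:
        report = gap_report(handle)
    print(f"{report['n_bp']} N out of {report['total_bp']} bp "
          f"({report['n_fraction']:.4%})")
    for name, start, end in report["gaps"]:
        print(name, start, end, end - start, sep="\t")
```

For the numbers in the question, 1463 bp of N in 4.9 Mb is roughly 0.03% of the assembly. The per-record coordinates are mainly useful for checking whether a gap falls inside a region you care about, such as a predicted genomic island.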


Thank you cschu181
