Query regarding genome coverage in de-novo assembly
14 months ago

We had some bacterial genomes sequenced by a biotech company. They sequenced on an Illumina NovaSeq and assembled the reads de novo with Shovill. I need to submit the genome data to NCBI, but I am confused by the details they provided: they report a genome coverage of, e.g., 497 for one bacterium. Could anyone explain what this actually means?

As I understand it, genome coverage can be at most 100%, so how can it exceed 100? And if a good sequencing depth is approximately 30x, what does a genome coverage of 497 mean?

Please share your point of view or any article/web link. Thanks in advance.

genomics Illumina Novaseq • 1.2k views
14 months ago
shelkmike ★ 1.4k

"Genome coverage" here is the same as "sequencing depth", not the percentage of the genome covered by reads (which cannot exceed 100%). A genome coverage of 497 (i.e., 497x) means that, on average, each base pair of this genome is represented in 497 reads.

Bacterial genomes are small and, thus, their sequencing is cheap. This is why you can sometimes see very large coverage of bacterial genomes.
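
To make the two meanings of "coverage" concrete, here is a minimal Python sketch. The function names and all numbers are hypothetical and chosen only for illustration; they are not taken from the data in this thread. Mean depth is approximated as total sequenced bases divided by genome length, while breadth is the fraction of positions covered by at least one read.

```python
from typing import Sequence


def mean_depth(num_reads: int, read_length: int, genome_size: int) -> float:
    """Approximate average depth: total sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def breadth_of_coverage(per_base_depth: Sequence[int]) -> float:
    """Fraction of positions covered by at least one read (between 0 and 1)."""
    if not per_base_depth:
        raise ValueError("per_base_depth must not be empty")
    covered = sum(1 for depth in per_base_depth if depth > 0)
    return covered / len(per_base_depth)


if __name__ == "__main__":
    # Hypothetical run: 10 million reads of 150 bp on a 5 Mbp genome.
    print(mean_depth(10_000_000, 150, 5_000_000))  # 300.0

    # Toy per-base depths for a 4 bp "genome": two positions have no reads.
    toy = [0, 3, 5, 0]
    print(sum(toy) / len(toy))       # mean depth: 2.0
    print(breadth_of_coverage(toy))  # breadth: 0.5
```

Depth can be any non-negative number (such as 497), whereas breadth is capped at 1.0, i.e. 100%.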


Okay, thank you so much for your response. Can you tell me what the standard depth for a bacterial genome is?


A rule of thumb is that a coverage of 50 is usually enough to make good assemblies of prokaryotic and eukaryotic genomes. N50 approximately reaches a plateau when the coverage is 50 or so. For example, see https://pubmed.ncbi.nlm.nih.gov/23593174/, https://pubmed.ncbi.nlm.nih.gov/26315384/, https://pubmed.ncbi.nlm.nih.gov/32781410/, https://pubmed.ncbi.nlm.nih.gov/34485177/, https://pubmed.ncbi.nlm.nih.gov/32385271/.
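
For readers unfamiliar with N50, here is a small Python sketch of the usual definition: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the contig lengths are made up for illustration.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that brings the running sum to >= 50% of the total.
        if 2 * running >= total:
            return length
    return lengths[-1]  # not reached for positive lengths


if __name__ == "__main__":
    # Hypothetical assembly: total 280, half is 140; 100 + 80 = 180 >= 140.
    print(n50([100, 80, 50, 30, 20]))  # 80
```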

However, there are exceptions. The coverage by Illumina reads (unlike the coverage by Nanopore or PacBio reads) highly depends on the GC-content (https://pubmed.ncbi.nlm.nih.gov/22323520/). The coverage is the highest in regions with the GC-content of approximately 50% and is lower in GC-rich or AT-rich regions. I once assembled a bacterial genome that had several very GC-rich regions (https://pubmed.ncbi.nlm.nih.gov/32793774/). Even though the average coverage by Illumina reads was 467, the coverage in these regions was 0. We had to sequence this genome with a Nanopore sequencer to make a complete assembly.
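
One rough way to spot regions that might suffer from this bias is to compute GC-content in fixed windows along an assembled sequence. The sketch below is only an illustration: the function names, the window size, and the 0.3/0.7 thresholds are arbitrary assumptions, not values from the papers cited above.

```python
from typing import List, Tuple


def gc_windows(seq: str, window: int) -> List[Tuple[int, float]]:
    """Return (start, GC fraction) for consecutive non-overlapping windows."""
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    result = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]  # the last window may be shorter
        gc = sum(1 for base in chunk if base in "GC")
        result.append((start, gc / len(chunk)))
    return result


def extreme_gc_windows(seq: str, window: int,
                       low: float = 0.3, high: float = 0.7) -> List[Tuple[int, float]]:
    """Keep windows whose GC fraction is below `low` or above `high` (assumed cutoffs)."""
    return [(start, gc) for start, gc in gc_windows(seq, window)
            if gc < low or gc > high]


if __name__ == "__main__":
    toy = "ATATACGTGGCC"  # toy sequence, windows of 4 bp
    print(gc_windows(toy, 4))          # [(0, 0.0), (4, 0.5), (8, 1.0)]
    print(extreme_gc_windows(toy, 4))  # [(0, 0.0), (8, 1.0)]
```

Windows flagged this way are only candidates; whether coverage actually drops there has to be checked against the per-base depth of the real read alignment.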


Thank you so much for your informative response.
