What Is The Difference Between Sequencing Depth And Coverage
1
10
13.8 years ago
User 6659 ▴ 990

Hello

Please could you clarify the different metrics associated with sequencing data? What is the difference between depth and coverage?

I have seen a question with this title closed already because it was classed as an exact duplicate of this question. I have read that question and thought I understood the answer. However, I have just seen another question on this forum here which leads me to think that depth and coverage are different things, as the people posing the question have given different quality metrics for depth and coverage:

e.g. depth (ex min. 50X)
e.g. coverage (ex min 90% per sample/per region)
e.g. quality score (ex. 90% of all sites [depth > 50X and Q20])

So, in order to stop this question being closed as a duplicate, perhaps my question should be phrased as: please can you explain the terms depth/coverage/quality score given in the question here?

Thanks a lot

sequencing read coverage • 51k views
2

Hmm... Another definition question! I think we should set a standard answer for this one. It will certainly reappear sooner or later. Where's the wiki?

0

+1 for the idea to document standard questions and answers on a wiki. It would be great to make slides based on that as well.

14
13.8 years ago
Michael 55k

There is no (well-defined) difference; see What Is The Sequencing 'Depth'? My impression is that they are often used synonymously.

The definition has to be inferred from the use of the terms in the literature. I often see the terms combined as "depth of coverage", e.g.:

This repeated sequencing is known as genome "depth of coverage."

http://www.ornl.gov/sci/techresources/Human_Genome/faq/seqfacts.shtml

Or in two different ways here (http://en.wikipedia.org/wiki/Chip-Sequencing):

Sensitivity of this technology depends on the depth of the sequencing run (i.e. the number of mapped sequence tags), the size of the genome and the distribution of the target factor. The sequencing depth is directly correlated with cost.

In this article, depth is used to refer to the whole genome, while coverage seems to be used for particular loci, as in these made-up examples: "the genome was sequenced with a depth of 10X" vs. "coverage of the xyZ gene was low".

http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000163

2

Yes, I think that's what was meant: 90% of all sample bases are covered at least once, which is yet another use of the term. Some regions are hard to sequence, but it was just an example in that question. The LASTZ alignment tool documentation uses coverage differently again: "Coverage is the fraction of bases in the entire input sequence (target or query, whichever is shorter) that are included in the alignment block, expressed as a percentage." Again, something very different. In the end, language is fuzzy. If we cover 80% of a communication with others intellectually, I would already consider that good depth ;)
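
To make the two usages concrete, here is a minimal sketch that computes both from the same data: a mean depth (the "50X" kind of number) and a breadth of coverage (the "90% of bases" kind of number). The function name `depth_and_breadth` and the toy depth values are made up for illustration; real per-base depths would come from an alignment tool.

```python
def depth_and_breadth(per_base_depth, min_depth=1):
    """Return (mean depth, fraction of positions with depth >= min_depth).

    per_base_depth: list of read depths, one entry per reference position.
    This is an illustrative helper, not part of any existing tool.
    """
    n = len(per_base_depth)
    if n == 0:
        raise ValueError("region has no positions")
    mean_depth = sum(per_base_depth) / n
    breadth = sum(1 for d in per_base_depth if d >= min_depth) / n
    return mean_depth, breadth


# Made-up per-base depths for a 10 bp region (illustrative only).
toy_region = [0, 0, 60, 60, 60, 60, 60, 60, 60, 80]

mean_depth, breadth_1x = depth_and_breadth(toy_region, min_depth=1)
_, breadth_50x = depth_and_breadth(toy_region, min_depth=50)

print(f"mean depth: {mean_depth:.1f}X")
print(f"fraction covered at least once: {breadth_1x:.0%}")
print(f"fraction covered at >= 50X: {breadth_50x:.0%}")
```

In this toy region the mean depth is 50X, yet only 80% of positions are covered at all, which is why a threshold on one number says little about the other.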

1

Please note that at an average coverage of 2X, the fraction of the genome that is not covered at all is over 10% for simple statistical reasons alone (that is, without taking into account that some areas are easier to sequence than others).
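
The "over 10%" figure can be checked with a short sketch. It assumes an idealized model in which reads land uniformly at random, so the depth at any base is approximately Poisson-distributed with mean equal to the average coverage (the usual Lander-Waterman style simplification); the probability of depth zero is then e^(-c). The function name `fraction_uncovered` is made up for this example.

```python
import math


def fraction_uncovered(mean_depth):
    """Expected fraction of bases with zero reads.

    Assumes per-base depth ~ Poisson(mean_depth), i.e. reads placed
    uniformly at random with no sequencing bias (an idealization).
    """
    return math.exp(-mean_depth)


for c in (1, 2, 5, 10):
    print(f"{c}X average coverage: {fraction_uncovered(c):.4%} of bases uncovered")
```

At 2X this gives roughly 13.5% uncovered, consistent with the statement above; real data typically does worse because coverage is not uniform.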

0

Thanks. What do you think they mean in the question I referred to by coverage = min 90% per sample/per region?

0

Thanks. What do you think a coverage of 90% per sample per region means in the question I linked to? Does that mean that 90% of any particular sequence region is sequenced to any depth? That would suggest 10% of a region isn't sequenced at all, which doesn't sound likely.

0

Thanks for answering. If 90% per sample per region is covered at least once and 10% of a sample is not covered, then the length of the region in question surely has a huge impact on that? How is region length taken into consideration? Is there a certain coverage that ensures 100% of the genome is sequenced at least once, or is that not possible due to hard-to-sequence regions?

0

In theory, only infinite coverage ensures that 100% of bases are sequenced at least once with probability 1. In practice, for larger genomes, that means it is impossible by shotgun sequencing alone.
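
As a rough illustration of why genome size matters, the sketch below uses the same idealized Poisson assumption (uniformly random read placement, no hard-to-sequence regions). The expected number of uncovered bases is then G * e^(-c) for genome size G and average coverage c, so pushing that expectation below one base needs c > ln(G). The function names and the genome sizes are hypothetical round numbers chosen for the example.

```python
import math


def expected_uncovered_bases(genome_size, mean_depth):
    """Expected count of zero-depth bases under a Poisson depth model."""
    return genome_size * math.exp(-mean_depth)


def depth_for_expected_uncovered(genome_size, max_uncovered=1.0):
    """Average depth at which the expected number of uncovered bases
    drops to max_uncovered, under the same idealized model."""
    return math.log(genome_size / max_uncovered)


# Round, illustrative genome sizes (not exact assembly lengths).
for label, size in (("5 Mb genome", 5e6), ("3 Gb genome", 3e9)):
    c = depth_for_expected_uncovered(size)
    print(f"{label}: about {c:.1f}X for < 1 expected uncovered base")
```

Even in this best case the expectation never reaches zero at finite depth, and real genomes need far more than the idealized figure because of repeats and sequencing bias.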
