454 Coverage Quality
14.1 years ago

I've been given a set of 454 sequences (exome sequencing). The coverage was:

 Mean coverage                          18,93
 1x                                   90,60%
 5x                                   68,09%
 10x                                  49,68%
 20x                                  33,75%
 40x                                  16,28%
 80x                                   0,76%
 100x                                  0,07%

I'm not used to this kind of data. As far as I understand, 10% of the bases haven't been covered at all.

If this were a set of Illumina GA data, I would say it was badly covered.

But for 454, what should I make of this result?

Update:

(Human Genome, two chromosomes have been sequenced)

             Count         Average Length Total-Bases
Reads        563.561       328,89         185.348.167
Matched      473.235       328,26         155.345.247
Not matched  90.326        332,16          30.002.920
References   2             127.542.357    255.084.714
coverage next-gen sequencing quality • 3.8k views

Hi Pierre, here are a few questions to help us assess your data: How many sequences did you get? What's their average length? What is the expected total exome length for your species? Cheers


Eric, thanks, I'll get this information tomorrow morning


It would also help to know more about the origins and characteristics of the sample that was sequenced.

14.1 years ago

From a purely probabilistic point of view, 18x coverage should leave far fewer than 10% of bases uncovered.

But since you say this data comes from a newer methodology (exon capture), the process itself may introduce larger errors. I recall some papers reporting a 96-98% recovery rate of exons in published results, and I would expect less successful results in day-to-day trials. It may simply be that some exonic regions are not captured well.

14.1 years ago
Casbon ★ 3.3k

Need more info: 454 Titanium or FLX, the extraction technique, and the targets.

You say that you targeted two chromosomes. Do you mean the whole chromosomes or just their exomes? How was the extraction done?

For your mean depth of 18, a really naive Poisson model suggests that the fraction of bases with zero coverage should be far less than 10% (the Poisson probability of zero with lambda=18 is e^-18, about 1.5e-8), but it really depends on your extraction technology.
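
To put numbers on the naive model, here is a minimal Python sketch that compares the coverage table from the question with what a Poisson distribution of the same mean would predict. It assumes the table reports the percentage of bases covered at least Nx, and the function names are only illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_at_least(k, lam):
    """P(X >= k), computed as 1 - P(X <= k-1); clamped at 0 against rounding."""
    return max(0.0, 1.0 - sum(poisson_pmf(i, lam) for i in range(k)))

MEAN_DEPTH = 18.93  # mean coverage reported in the question

# Observed values copied from the question, read here as "% of bases at >= Nx"
# (that reading is an assumption).
OBSERVED = {1: 90.60, 5: 68.09, 10: 49.68, 20: 33.75,
            40: 16.28, 80: 0.76, 100: 0.07}

print(f"P(depth = 0) under Poisson: {poisson_pmf(0, MEAN_DEPTH):.2e}")
for depth, obs in OBSERVED.items():
    expected = 100.0 * poisson_at_least(depth, MEAN_DEPTH)
    print(f"{depth:>4}x  observed {obs:6.2f}%  Poisson {expected:6.2f}%")
```

If the observed column falls well below the Poisson column at low depths and well above it at high depths, the reads are piling up on some targets and missing others, which points at the capture step rather than at sequencing depth.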

14.1 years ago
Fiamh ▴ 220

As Istvan pointed out, it depends heavily on the extraction approach. The recovery figure I've seen mentioned most consistently is even lower, at 80-90% (for example, see Dan's recent report from the CSHL meeting).

14.1 years ago
Ketil 4.1k

There could also be a problem with how the coverage was measured, e.g., if you're using an aligner (like Blast) that masks low-complexity regions, or if you require unambiguous matches (in which case repeats might be missed).

Generally, read coverage isn't as evenly distributed as one would like, so stochastic models like Poisson only work as a rough approximation.
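
As a rough illustration of how uneven the coverage would have to be, the sketch below swaps the Poisson model for a negative binomial, a common choice for overdispersed counts. That choice is an assumption of this example, not something established in this thread. It solves for the dispersion parameter `r` that reproduces both the reported mean depth and the roughly 9.4% of bases with zero coverage; the function names are hypothetical.

```python
import math

def nb_zero_prob(mean, r):
    """P(X = 0) for a negative binomial with the given mean and dispersion r."""
    return (r / (r + mean)) ** r

def fit_dispersion(mean, target_p0, lo=1e-6, hi=1e3, iters=200):
    """Bisection for r such that nb_zero_prob(mean, r) equals target_p0.

    P(0) decreases monotonically in r, from 1 (r -> 0) to exp(-mean) (r -> inf),
    so a too-high P(0) means r must grow.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if nb_zero_prob(mean, mid) > target_p0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

MEAN_DEPTH = 18.93        # mean coverage reported in the question
UNCOVERED = 1.0 - 0.9060  # fraction of bases below 1x, from the question

r_fit = fit_dispersion(MEAN_DEPTH, UNCOVERED)
print(f"dispersion r matching the data: {r_fit:.3f}")
print(f"Poisson limit P(0) = {math.exp(-MEAN_DEPTH):.2e}")
```

A small fitted `r` means the depth distribution is far more skewed than Poisson. The fit cannot tell whether that skew comes from capture failures, from masked or ambiguous regions in the alignment, or from both.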
