Acceptable "Percentage Of Usable Bases On Target" In Exome Sequencing Experiment
11.5 years ago
Christian ★ 3.1k

Picard outputs a quality metric named PCT_USABLE_BASES_ON_TARGET, which is the number of aligned, de-duplicated, on-target bases divided by the number of PF (passing filter) bases available.
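For anyone who wants to pull this value out programmatically, here is a minimal sketch. It assumes the usual Picard metrics text layout (comment lines starting with `#`, a `## METRICS CLASS` line, then one tab-separated header row followed by one value row). The function name and the sample values are made up for illustration and are not real results.

```python
def read_picard_metric(text, name):
    """Return one named metric from the text of a Picard metrics file as a float."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            # The header row and the first value row follow the class line.
            header = lines[i + 1].split("\t")
            values = lines[i + 2].split("\t")
            return float(dict(zip(header, values))[name])
    raise ValueError("no METRICS CLASS section found")


# Hypothetical, heavily truncated metrics file (real files have many more columns).
sample = "\n".join([
    "## htsjdk.samtools.metrics.StringHeader",
    "## METRICS CLASS\tpicard.analysis.directed.HsMetrics",
    "BAIT_SET\tPF_BASES\tPCT_USABLE_BASES_ON_TARGET",
    "my_exome_kit\t5000000000\t0.31",
])

print(read_picard_metric(sample, "PCT_USABLE_BASES_ON_TARGET"))
```

In practice you would pass in the contents of your own metrics file, e.g. `open(path).read()`.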

For a successful exome sequencing experiment, what minimum percentages are acceptable here?

exome sequencing picard • 6.6k views

Found this statement on genohub:

Enrichment efficiency = Passing Filter (PF) reads mapped to target / Total number of PF reads mapped to reference.

If using the correct blocking oligos, this rate usually ranges from 0.65 - 0.75.

But as far as I know, and as discussed below, this metric does not take duplicate reads into account.

11.5 years ago

It's not so much about percentages as about having sufficient coverage of your target.

In general, the companies that make exon capture kits promise about 60% on target. So if you are in that ballpark, the library is okay.


Yes, but what if, of the 60% of reads that are on target, 50% are duplicates that you would throw away by default? Would you still consider this library to be ok?

That's why I like the PCT_USABLE_BASES_ON_TARGET metric, because it takes duplicate reads into account ("de-duped"). What I am still wondering is what typical values other users get for this metric in their exome sequencing experiments.
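To put a number on that scenario, here is a back-of-the-envelope sketch. It assumes that all PF bases align and that duplicates are spread evenly over on-target and off-target reads, neither of which is guaranteed in a real library. The function name is just for illustration; this is not how Picard computes the metric internally.

```python
def usable_fraction(on_target_fraction, duplicate_fraction):
    """Rough fraction of PF bases that are both on target and non-duplicate.

    Assumption: the duplicate rate is the same on and off target.
    """
    return on_target_fraction * (1.0 - duplicate_fraction)


# 60% on target, half of which are duplicates
print(usable_fraction(0.60, 0.50))
```

So a library that looks fine at 60% on target would only have about 30% usable bases on target under these assumptions.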

10.2 years ago

There is not really a strict threshold for this metric as far as I can see, as it is obviously related to the number of reads you have to begin with and then tells you something about your coverage depth. If your coverage depth is high enough I wouldn't worry - although obviously that depends on what you are trying to do.
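The link between read count, this metric and coverage depth can be sketched roughly as follows. This is an approximation under the assumption that mean target coverage is about (usable on-target bases) / (target size); the read count, read length and target size below are hypothetical, and the function name is only for illustration.

```python
def expected_mean_target_coverage(pf_bases, pct_usable_on_target, target_size_bp):
    """Approximate mean depth over the target from the usable on-target fraction."""
    return pf_bases * pct_usable_on_target / target_size_bp


# Hypothetical run: 50 million PF reads of 100 bp, 50 Mb target, 30% usable on target
pf_bases = 50_000_000 * 100
depth = expected_mean_target_coverage(pf_bases, 0.30, 50_000_000)
print(f"approx. mean target coverage: {depth:.0f}x")
```

With these made-up numbers the estimate is roughly 30x; doubling the number of reads or the usable fraction would double it, which is why the percentage alone does not tell you whether the experiment is adequate.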

I would see this metric more as a diagnostic test to run if I have unusually low or high coverage depth of my target, or the coverage uniformity is biased toward one end of the target. This metric can answer the question of 'why do I have such good/bad coverage depth' (is it because of not enough runs, or because we aren't hitting the target, or maybe because we filtered on quality too aggressively?), but I can't see what else it could tell you.

I'm willing to be corrected here; I am going on what I have read in the Picard manual.
