7.2 years ago
Jon17
Is there a way to get these metrics on a whole genome scale?
I might hack around it by creating BED and interval list files that cover the entire genome.
- HS_PENALTY_20X - "hybrid selection penalty" incurred to get 80% of target bases to 20X.
- HS_LIBRARY_SIZE - estimated number of unique molecules in the selected part of the library.
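The whole-genome BED idea above could be sketched roughly as follows. This assumes a samtools-style `.fai` index is available for the reference (tab-separated, contig name in column 1, length in column 2). The function name `fai_to_bed` and the example offsets are hypothetical, and the resulting BED would still need converting to an interval list before the metrics tool could use it.

```python
def fai_to_bed(fai_lines):
    """Turn .fai index lines into one BED record per contig.

    BED is 0-based and half-open, so 0..length spans the whole contig.
    """
    bed = []
    for line in fai_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        bed.append(f"{name}\t0\t{length}")
    return bed


# Illustrative index lines; the offset and line-width columns are made up.
example_fai = [
    "chr1\t248956422\t112\t70\t71",
    "chrM\t16569\t3000000000\t70\t71",
]

for record in fai_to_bed(example_fai):
    print(record)
```

Treating every contig as a "target" this way also pulls in unplaced and decoy contigs, which may or may not be wanted.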
I'm guessing that for whole genome sequencing, HS_LIBRARY_SIZE is just the number of reads left after duplicates are removed?
But for HS_PENALTY_20X, should I just calculate it manually, using the formula in the documentation?
- HS_PENALTY_20X = PF_ALIGNED_BASES / ($REFERENCE_SIZE * 20)
That doesn't seem like it will work... :-p
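For reference, the arithmetic in that formula can be sketched as below. The function name `hs_penalty` and the input numbers are hypothetical, not taken from any real run. One likely reason the manual calculation feels off: as the metric description reads, the ratio only acts as a "penalty" when PF_ALIGNED_BASES is the amount of sequencing at which 80% of bases first reach 20X. Plugging in the total from an arbitrary run just gives mean depth divided by 20.

```python
def hs_penalty(pf_aligned_bases, reference_size, target_depth=20):
    """Ratio from the post: PF_ALIGNED_BASES / (REFERENCE_SIZE * depth)."""
    if reference_size <= 0 or target_depth <= 0:
        raise ValueError("reference_size and target_depth must be positive")
    return pf_aligned_bases / (reference_size * target_depth)


# Hypothetical inputs: 124 Gb of aligned bases on a 3.1 Gb reference.
penalty = hs_penalty(124_000_000_000, 3_100_000_000)
print(f"ratio = {penalty:.2f}")
```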