6.5 years ago
bioinforesearchquestions
Hi guys,
We have exome-sequenced samples at a minimum of 15X coverage using capture kits.
We would like to:
- identify genes that may have low coverage because no probes are available for them
- identify genes that fall within regions that are difficult to sequence with the capture kit
- evaluate coverage at a region of interest and generate specificity, sensitivity and PPV for that region
- develop statistics, based on the existing samples, on how well the samples provide on-target coverage and uniformity, and how well they can detect single nucleotide variants (SNVs), insertions and deletions (indels) and copy number variants (CNVs) in the region of interest
Have any such statistics already been developed?
Could the experts kindly shed some light on the above queries?
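Do you have a BED file with the exome probes? If you do, you can use bedtools to achieve several of your goals:
Use bedtools intersect -v with the genome annotation as the -a file and the probes as the -b file to get the genes / exons not covered by any probe.
Use bedtools coverage with the regions of interest and your alignments to get the depth and breadth of coverage for each region.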
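To make the first step concrete, here is a rough pure-Python sketch of the kind of result bedtools intersect -v reports: the annotation features that overlap no probe at all. It is only an illustration, not how bedtools is implemented. The function name uncovered_features, the tuple layout and the toy coordinates are all made up for the example, and it assumes BED-style 0-based, half-open intervals.

```python
from collections import defaultdict


def uncovered_features(features, probes):
    """Return names of features that overlap no probe.

    features: iterable of (chrom, start, end, name) tuples
    probes:   iterable of (chrom, start, end) tuples
    Coordinates are assumed to be BED-style: 0-based, half-open.
    """
    # Group probe intervals by chromosome for quick lookup.
    probes_by_chrom = defaultdict(list)
    for chrom, start, end in probes:
        probes_by_chrom[chrom].append((start, end))

    missing = []
    for chrom, start, end, name in features:
        # Two half-open intervals overlap if each starts before the other ends.
        has_probe = any(
            p_start < end and start < p_end
            for p_start, p_end in probes_by_chrom.get(chrom, [])
        )
        if not has_probe:
            missing.append(name)
    return missing


# Toy data (hypothetical exons and probes, for illustration only).
exons = [
    ("chr1", 100, 200, "GENE_A"),
    ("chr1", 500, 600, "GENE_B"),
    ("chr2", 50, 150, "GENE_C"),
]
probes = [
    ("chr1", 150, 250),
    ("chr2", 150, 300),
]

print(uncovered_features(exons, probes))  # ['GENE_B', 'GENE_C']
```

GENE_C is reported as uncovered because its end (150) only touches the probe start (150); with half-open coordinates that is adjacency, not overlap. For real data bedtools itself will be much faster, since this sketch compares every feature against every probe on the same chromosome.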