4.9 years ago
CY
My impression is that small indels (a couple of bp) are identified through the CIGAR string in the BAM file, and typical CNVs (at least thousands of bp) are detected through read depth.
What about indels or CNVs with sizes between these, say hundreds of bp? Such CNVs are too long to be covered by a single read and too short to be detected by statistically comparing differences in read depth.
Is there a consensus that this kind of CNV is hard to detect? Or is there a common detection strategy for them? Are they generally thought to affect protein function (especially in an oncology context)?
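For readers unfamiliar with the CIGAR-based approach mentioned in the question, here is a minimal sketch of how small indels can be read off a single alignment. It assumes a 1-based leftmost alignment position and a standard SAM CIGAR string; the function name `indels_from_cigar` and the returned tuple layout are made up for illustration, and a real variant caller does far more (realignment, filtering, genotyping).

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def indels_from_cigar(pos, cigar):
    """Return (type, ref_pos, length) for each I or D operation.

    pos is the 1-based leftmost aligned reference position of the read.
    For a deletion, ref_pos is the first deleted reference base.
    For an insertion, ref_pos is the reference base that follows the
    inserted sequence (a convention chosen for this sketch).
    """
    events = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "I":
            events.append(("INS", ref, n))
        elif op == "D":
            events.append(("DEL", ref, n))
        # Only these operations advance the reference coordinate.
        if op in "MDN=X":
            ref += n
    return events

if __name__ == "__main__":
    print(indels_from_cigar(100, "10M2I5M3D20M"))
```

Because an indel has to fit inside one aligned read to show up this way, the approach stops working as the event size approaches the read length, which is the gap the question is asking about.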
The answer to all of these questions is yes :) Things get much easier when we switch to WGS, but in WES we have a blind spot for coding variants between approximately 100 bp and 4 kb.
WGS is suitable for detecting large-scale CNVs, such as those spanning several Mb. For intermediate CNVs, say several kb, WGS cannot provide sufficient power (too few reads within several kb) for read-depth-based detection. I guess NGS data is not well suited to detecting intermediate CNVs.
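One rough way to put numbers on the "reads within several kb" argument is to estimate how many reads are expected to fall in a region and how noisy that count is. The sketch below assumes uniform coverage and Poisson-distributed read counts, which real data (GC bias, mappability, capture efficiency in WES) does not satisfy; the function names and the example depth and read length are illustrative assumptions, not values from this thread.

```python
import math

def expected_reads(region_bp, mean_depth, read_len):
    """Approximate number of reads covering a region under uniform coverage."""
    return mean_depth * region_bp / read_len

def poisson_cv(n_reads):
    """Relative noise (std / mean) of a Poisson-distributed read count."""
    return 1.0 / math.sqrt(n_reads)

if __name__ == "__main__":
    # Hypothetical settings: 30x mean depth, 150 bp reads.
    for region_bp in (300, 3_000, 3_000_000):
        n = expected_reads(region_bp, mean_depth=30, read_len=150)
        print(f"{region_bp:>9} bp: ~{n:.0f} reads, relative noise ~{poisson_cv(n):.3f}")
```

The relative noise shrinks only with the square root of the read count, so shorter regions or lower depth make a given copy-number change harder to separate from sampling variation. Whether a region is detectable in practice also depends on the biases that this idealized model ignores.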