Hello everyone!
I'm trying to identify somatic CNVs and somatic SVs for a set of tumors, with both WES and WGS samples. Our sequencing was done by many different providers over the years, all short-read Illumina sequencing. The pipeline is consistent: we use GATK to identify germline mutations, and I've used consensus calling with multiple callers to remove any somatic mutations.
For CNVs, I used CNVkit with b-allele frequencies, with or without THetA2 for clonality/normal contamination.
For SVs, I used Manta.
In CNVkit, I also used a panel of normals consisting of 50 samples for the WGS analyses and 500 samples for the WES analyses.
The problem I'm having is that for almost all tumors (about 90%), both CNVkit and Manta show massive chromosomal losses in the same samples. We do not expect that: our samples have been sequenced with many different methods, and we never saw anything that indicated multiple whole-chromosome losses.
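To put a number on "massive losses" per sample, here is a minimal sketch of how the fraction of each chromosome called as lost could be summarized from a CNVkit segment file. It assumes the usual tab-separated `.cns` layout with `chromosome`, `start`, `end` and `log2` columns. The function name and the `-0.4` cutoff are my own illustrative choices, not CNVkit defaults, and the input here is toy data rather than one of my samples.

```python
import csv
import io
from collections import defaultdict


def loss_fraction_by_chrom(cns_handle, log2_threshold=-0.4):
    """Return {chromosome: fraction of segmented bases with log2 below the threshold}.

    The threshold is an arbitrary illustrative cutoff (assumption), not a CNVkit default.
    """
    lost = defaultdict(int)
    total = defaultdict(int)
    for row in csv.DictReader(cns_handle, delimiter="\t"):
        length = int(row["end"]) - int(row["start"])
        total[row["chromosome"]] += length
        if float(row["log2"]) < log2_threshold:
            lost[row["chromosome"]] += length
    # Skip chromosomes with no segmented bases to avoid dividing by zero.
    return {chrom: lost[chrom] / total[chrom] for chrom in total if total[chrom]}


# Toy segments (made up for illustration, not real results).
toy_cns = io.StringIO(
    "chromosome\tstart\tend\tgene\tlog2\n"
    "chr1\t0\t100\t-\t-0.8\n"
    "chr1\t100\t200\t-\t0.05\n"
    "chr2\t0\t50\t-\t-1.0\n"
)
fractions = loss_fraction_by_chrom(toy_cns)
for chrom, frac in fractions.items():
    print(f"{chrom}\t{frac:.2f}")
```

Running this over every sample and comparing the per-chromosome fractions between providers or batches might show whether the losses track a technical variable rather than the tumors themselves.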
One example is below:
The same sample run with THetA2 presents the same losses, but normalized to a diploid genome. The Manta somatic pipeline output for this sample has over 150K SVs, covering similar regions.
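To see what the 150K SV calls are made of, a rough sketch like the one below could tally the records by `SVTYPE` and `FILTER`. It only assumes standard VCF columns (FILTER in column 7, INFO in column 8) and plain uncompressed text; the function name and the toy records are hypothetical, not my actual output.

```python
from collections import Counter


def tally_svs(vcf_lines):
    """Count VCF records by (SVTYPE, FILTER), skipping header and blank lines."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        filt = fields[6]
        # Parse INFO into a dict; flag-style keys without "=" map to True.
        info = dict(
            item.split("=", 1) if "=" in item else (item, True)
            for item in fields[7].split(";")
        )
        counts[(info.get("SVTYPE", "NA"), filt)] += 1
    return counts


# Toy records (made up for illustration).
toy_vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tEND=5000;SVTYPE=DEL;SOMATIC\n",
    "chr1\t9000\tsv2\tN\t<DEL>\t.\tPASS\tEND=9500;SVTYPE=DEL;SOMATIC\n",
    "chr2\t2000\tsv3\tN\tN]chr5:100]\t.\tMinSomaticScore\tSVTYPE=BND;SOMATIC\n",
]
sv_counts = tally_svs(toy_vcf)
for (svtype, filt), n in sorted(sv_counts.items()):
    print(f"{svtype}\t{filt}\t{n}")
```

If most calls fall in one type or fail filters, that would point toward an artifact more than toward real rearrangements.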
Has anyone seen this before? I know these are tumours, and we are thinking of doing optical genome mapping in case this is real, but for now it looks like an artifact. Does anyone have any suggestions?
Thanks