Question

Consistent somatic CNV calling across samples: spurious, error or real?


4 months ago

avelarbio46 ▴ 30

Hello everyone!

I'm trying to identify somatic CNVs and somatic SVs for a set of tumors, with both WES and WGS samples. Our sequencing was done by many different providers over the years, all short-read Illumina sequencing. The pipeline is consistent: we use GATK to identify germline mutations, and I've used consensus calling with multiple callers to remove any somatic mutations. From that, I used the following: CNVkit + B-allele frequency, with or without theta2 for clonality/normal contamination.

For SVs, I used Manta.

In CNVkit, I also used a panel of normals consisting of 50 WGS samples, and for the WES analyses, 500 samples.

The problem I'm having is that for almost all tumors (90%), both CNVkit and Manta show massive chromosomal losses, in the same samples. We do not expect that: our samples have been sequenced with many different methods, and we never saw anything that indicated multiple whole-chromosome losses.

One example is below: CNVkit, WGS, no theta2.

The same sample with theta2 presents the same losses, but normalized to a diploid genome. The Manta somatic pipeline reports over 150K SVs for this sample, covering similar regions.
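To make "the same losses, but normalized" concrete, here is a minimal sketch of how the per-chromosome shift could be summarized from a CNVkit `.cnr` file. It assumes a tab-separated file with `chromosome` and `log2` columns. The function names and the `-0.4` loss threshold are hypothetical, chosen only for illustration, and this is not part of the pipeline described above.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def per_chromosome_median_log2(cnr_text):
    """Return {chromosome: median log2} from tab-separated .cnr-style text.

    Assumes the header contains 'chromosome' and 'log2' columns.
    """
    by_chrom = defaultdict(list)
    reader = csv.DictReader(io.StringIO(cnr_text), delimiter="\t")
    for row in reader:
        by_chrom[row["chromosome"]].append(float(row["log2"]))
    return {chrom: median(values) for chrom, values in by_chrom.items()}


def recenter(medians):
    """Subtract the median of the per-chromosome medians.

    After this, the most typical chromosome sits near log2 = 0, which
    separates a global offset from chromosome-specific shifts.
    """
    offset = median(medians.values())
    return {chrom: value - offset for chrom, value in medians.items()}


def flag_losses(medians, threshold=-0.4):
    """List chromosomes whose median log2 is below the (hypothetical) threshold."""
    return sorted(chrom for chrom, value in medians.items() if value < threshold)
```

Reading a real file would look like `per_chromosome_median_log2(open("sample.cnr").read())`, where `sample.cnr` is a placeholder name. If most chromosomes are still flagged after `recenter`, a simple global offset does not explain the losses, which is the situation described here.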

Has anyone seen this before? I know these are tumours, and we are thinking of doing optical genome mapping in case this is real, but for now it looks like an artifact. Does anyone have any suggestions?

Thanks

MANTA CNVKit SV somatic CNV • 225 views

updated 4 months ago by GenoMax 148k • written 4 months ago by avelarbio46 ▴ 30