Question

Problem with sensitivity analysis of somatic variant callers


8.6 years ago

phongphak.06 ▴ 20

Hello!

I'm working on somatic variant identification in cancer samples. I simulated Illumina paired-end reads with known variants using VarSim (http://bioinform.github.io/varsim/), a tool that can simulate a human genome with both germline and somatic mutations (it uses ART as the read simulator). In addition, I simulated tumor reads with variant allele frequencies ranging from 0.1 to 0.5.

The paper "Detecting somatic point mutations in cancer genome sequencing data: a comparison of mutation callers" (DOI: 10.1186/gm495) reports the sensitivity of several somatic variant callers, including MuTect, SomaticSniper, VarScan, Strelka, and JointSNVMix. I followed the commands the authors provide in the supplement to call somatic variants with each tool. However, my sensitivity is quite low compared with their reported values. In the paper, the maximum is around 0.9 at an allele frequency of 0.5 (MuTect), whereas at the same allele frequency with MuTect I got a sensitivity of only 0.7 (my simulated reads cover chr22 only).
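For reference, sensitivity here means the fraction of simulated (truth) somatic variants that the caller recovers, TP / (TP + FN). The snippet below is only a minimal sketch of that calculation, stratified by allele frequency. The function name `sensitivity_by_af`, the `(chrom, pos, ref, alt)` key, and the toy variants are assumptions for illustration; they are not part of VarSim or of any caller's output format.

```python
from collections import defaultdict


def sensitivity_by_af(truth_af, calls):
    """Return {allele_frequency: TP / (TP + FN)} for a truth set.

    truth_af: dict mapping (chrom, pos, ref, alt) -> simulated allele frequency
    calls:    iterable of (chrom, pos, ref, alt) reported by a caller
    """
    called = set(calls)
    found = defaultdict(int)   # true positives per allele frequency
    total = defaultdict(int)   # all truth variants per allele frequency
    for variant, af in truth_af.items():
        total[af] += 1
        if variant in called:
            found[af] += 1
    return {af: found[af] / total[af] for af in total}


# Hypothetical toy data, not real results.
truth_af = {
    ("chr22", 100, "A", "T"): 0.5,
    ("chr22", 200, "G", "C"): 0.5,
    ("chr22", 300, "C", "A"): 0.1,
    ("chr22", 400, "T", "G"): 0.1,
}
calls = [
    ("chr22", 100, "A", "T"),
    ("chr22", 200, "G", "C"),
    ("chr22", 300, "C", "A"),
    ("chr22", 999, "A", "G"),  # false positive, does not affect sensitivity
]

result = sensitivity_by_af(truth_af, calls)
print(result[0.5], result[0.1])  # 1.0 0.5
```

Matching on exact position and alleles is a design choice of this sketch; if the truth VCF and the caller normalize variants differently, true calls can be counted as misses and the sensitivity will look lower than it is.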

I have no idea why the results are so different. Even though I used only chr22, I don't think the results should differ this much.

Additionally, I aligned the reads with bwa-mem, marked duplicates with Picard tools, and then performed realignment and recalibration with GATK.

SNP next-gen sequencing sequence • 2.1k views


score 0 · Answer 1 · 2016-12-20


7.9 years ago

biostarsjic • 0

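If you have very high read depth, or PCR amplicons that produce many reads with identical start and end points, removing duplicates can hurt the analysis, because independent fragments that happen to share the same coordinates are discarded as if they were PCR copies.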
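To get a rough feel for how often this happens, here is a minimal sketch that estimates the fraction of independent fragments that would share coordinates with another fragment purely by chance. It assumes fragments fall uniformly and independently over the possible (start position, insert size) combinations, which real libraries do not do exactly, and the function name and parameters are hypothetical.

```python
import math


def chance_duplicate_fraction(n_fragments, n_positions, n_insert_sizes=1):
    """Expected fraction of independent fragments flagged as duplicates.

    Assumes each fragment lands uniformly at random in one of
    n_positions * n_insert_sizes coordinate "slots"; every fragment beyond
    the first in an occupied slot counts as a duplicate.
    """
    slots = n_positions * n_insert_sizes
    if slots == 1:
        occupied = 1.0  # every fragment has the same coordinates
    else:
        # Expected occupied slots: slots * (1 - (1 - 1/slots) ** n_fragments),
        # written with log1p/expm1 for numerical stability.
        occupied = slots * -math.expm1(n_fragments * math.log1p(-1.0 / slots))
    return 1.0 - occupied / n_fragments


# Amplicon-like case: 1000 fragments, all with the same start and end.
print(chance_duplicate_fraction(1000, 1))  # 0.999

# Shotgun-like case over a 10 kb window with 200 possible insert sizes
# (illustrative numbers only), at increasing numbers of fragments.
for n in (1_000, 100_000, 10_000_000):
    print(n, chance_duplicate_fraction(n, 10_000, 200))
```

Under these assumptions the chance-duplicate fraction grows with depth and approaches 1 for amplicon data, which is the situation where skipping duplicate removal is worth considering.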
