Question

Unreliable Results With Somaticsniper

0

12.3 years ago

jiagehao ▴ 10

I used SomaticSniper to call cancer-specific SNPs in a matched pair of cancer and normal tissues:

bam-somaticsniper -q 1 -Q 40 -f ucsc.hg19.fasta ERR031023.bam(normal) ERR031024.bam(cancer) ERR031024.snp.vcf

From the resulting calls, I selected 1000 SNPs with a somatic score greater than 150.

When I used IGV to visualize and verify the 1000 SNPs, however, I found a lot of problems. For example, on chromosome M most of the SNPs looked like machine artifacts: many reads from both the normal and the cancer sample mapped correctly to the human genome and showed no SNP, yet SomaticSniper reported one.

Moreover, on chromosome 1 there were positions with no reads mapped at all, yet SomaticSniper still reported SNPs there. There are too many false SNPs, which makes the results unreliable.

Is this a problem with my command line, or have you faced a similar problem before? Any suggestions would be appreciated. LI Jia

ADD COMMENT • link 12.3 years ago by jiagehao ▴ 10

score 2 · Answer 1 · 2013-01-22

2

12.3 years ago

ernfrid ▴ 400

There are a couple things to note or check.

Your command line is incorrect. The tumor should be listed before the normal.
The fact that you seem to have SNVs called where there are no reads suggests that there may be an issue with the reference sequence. Is hg19 the same reference sequence that the reads were aligned to? How do you know that there were no reads mapped there?
SomaticSniper's false positive rate increases with coverage and thus the mitochondrial chromosome typically has a high number of false positives.
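
One way to approach the reference question above is to compare the contigs listed in the BAM header with those in the FASTA index. The following is only a rough sketch in Python, assuming you have saved the output of `samtools view -H ERR031024.bam` and the `.fai` file produced by `samtools faidx ucsc.hg19.fasta` as text. The function names are made up for illustration and are not part of SomaticSniper or samtools. Matching names and lengths does not prove the sequences are identical, but a mismatch is a strong sign that the reads were aligned to a different reference.

```python
def parse_bam_header_contigs(header_text):
    """Return {contig_name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # only @SQ lines describe reference sequences
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai_contigs(fai_text):
    """Return {contig_name: length} from the text of a FASTA index (.fai)."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]  # first two columns: name, length
            contigs[name] = int(length)
    return contigs


def reference_mismatches(bam_contigs, fai_contigs):
    """List contigs in the BAM header that are missing or differ in length."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in fai_contigs:
            problems.append(f"{name}: in BAM header but not in reference")
        elif fai_contigs[name] != length:
            problems.append(
                f"{name}: length {length} in BAM vs {fai_contigs[name]} in reference"
            )
    return problems
```

An empty list from `reference_mismatches` means every contig in the BAM header was found in the reference index with the same length.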

ADD COMMENT • link 12.3 years ago by ernfrid ▴ 400

0

I redid the SomaticSniper SNP calling. The results are much better; however, for SNPs with high read coverage, the false positive rate is really high. Any suggestions for that? Thank you for your reply.

ADD REPLY • link 12.3 years ago by jiagehao ▴ 10

0

Great! I'm glad that re-running has helped significantly.

We do not have a great solution for high coverage SNVs. The problems there are fundamental to the algorithm; I won't go into the details. Your best bet is to filter out high coverage calls and/or utilize a different caller for those high depth regions. VarScan 2 does a nice job for us in general, but I'm certain there are other callers that would work well.
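
Filtering out high coverage calls could look roughly like the sketch below. It assumes the calls were written in VCF format (for example with `-F vcf`, if your SomaticSniper version supports it) and that the FORMAT column carries a per-sample depth `DP` and a somatic score `SSC`; please check both against your own output. The function names and the cutoff values are hypothetical and need tuning per dataset.

```python
def sample_fields(format_col, sample_col):
    """Pair FORMAT keys with one sample's colon-separated values."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def filter_calls(vcf_lines, max_depth, min_somatic_score=0):
    """Keep VCF records that pass a depth ceiling and a somatic score floor.

    max_depth and min_somatic_score are illustrative cutoffs, not recommendations.
    """
    kept = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.split("\t")
        samples = [sample_fields(cols[8], s) for s in cols[9:]]
        depths = [int(s["DP"]) for s in samples]
        # Assumption: SSC is numeric only where a somatic score was computed.
        scores = [int(s["SSC"]) for s in samples if s.get("SSC", ".").isdigit()]
        if not scores:
            continue  # no usable somatic score on this record
        if max(depths) <= max_depth and max(scores) >= min_somatic_score:
            kept.append(line)
    return kept
```

A record is dropped as soon as either sample exceeds `max_depth`, so very deep regions such as the mitochondrial chromosome are removed regardless of their somatic score.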

ADD REPLY • link 12.3 years ago by ernfrid ▴ 400

score 0 · Answer 2 · 2013-01-23

0

12.3 years ago

jiagehao ▴ 10

OK, thank you so much. SomaticSniper does work well on one pair of normal and cancer tissues. Is it possible to call SNPs in several pairs of tissues at once? I know that many samples are preferred for GATK SNP calling; how about SomaticSniper?

ADD COMMENT • link 12.3 years ago by jiagehao ▴ 10

0

No, this is not the case with SomaticSniper. It will only handle one tumor/normal pair at a time.
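
Since each run takes a single pair, several pairs can be handled by building one command per pair. The following is a small sketch rather than an official wrapper: the helper names and the dictionary layout are invented for illustration, and the options are simply the ones from the original post. Note that the tumor BAM is placed before the normal BAM, as pointed out in the first answer.

```python
def somaticsniper_command(reference, tumor_bam, normal_bam, output):
    """Build the argument list for one tumor/normal pair (tumor listed first)."""
    return [
        "bam-somaticsniper",
        "-q", "1",
        "-Q", "40",
        "-f", reference,
        tumor_bam,   # tumor comes before normal
        normal_bam,
        output,
    ]


def commands_for_pairs(reference, pairs):
    """Return one command per pair; each pair is a dict with tumor/normal/output."""
    return [
        somaticsniper_command(reference, p["tumor"], p["normal"], p["output"])
        for p in pairs
    ]


pairs = [
    {
        "tumor": "ERR031024.bam",
        "normal": "ERR031023.bam",
        "output": "ERR031024.snp.vcf",
    },
]

for cmd in commands_for_pairs("ucsc.hg19.fasta", pairs):
    print(" ".join(cmd))
```

Each printed line can be run as is, or the argument list can be passed to `subprocess.run(cmd, check=True)` to run the pairs one after another.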

ADD REPLY • link 12.3 years ago by ernfrid ▴ 400

score 0 · Answer 3 · 2013-01-30

Hi, I have used SomaticSniper to call cancer-specific SNPs in a pair of cancer and normal tissues; about 183436 SNPs differ between the cancer and normal samples. I would like to select some high-quality SNPs with a somatic score above 200. However, when I check the selected SNPs in IGV, a large number of them do not correspond to what IGV shows. My question is whether the somatic score is a reliable and accurate parameter for selecting SNPs for further research. Moreover, as you have already told me, high-coverage SNPs have a higher false positive rate. If I want to filter them out, what coverage cutoff do you suggest? Above 50, or higher? Thank you so much for your reply.