Unreliable Results With SomaticSniper
11.8 years ago
jiagehao ▴ 10

I have used SomaticSniper to call cancer-specific SNPs in a pair of cancer and normal tissues:

bam-somaticsniper -q 1 -Q 40 -f ucsc.hg19.fasta ERR031023.bam(normal) ERR031024.bam(cancer) ERR031024.snp.vcf

From the resulting calls, I selected 1000 SNPs with a somatic score greater than 150.

When I used IGV to visualize and verify the 1000 SNPs, however, I found a lot of problems. For example, on chromosome M most of the SNPs appeared to be machine artifacts: many reads from both the normal and the cancer sample mapped correctly to the human genome and showed no SNP, yet SomaticSniper reported one.

Moreover, on chromosome 1 SomaticSniper reported SNPs at positions where no reads were mapped to the human genome. There are too many false SNPs, which makes the results unreliable.

Is this a problem with my command line, or has anyone faced a similar problem before? Any suggestions would be appreciated. LI Jia

11.8 years ago
ernfrid ▴ 400

There are a couple things to note or check.

  1. Your command line is incorrect. The tumor should be listed before the normal.
  2. The fact that you seem to have SNVs called where there are no reads suggests that there may be an issue with the reference sequence. Is hg19 the same reference sequence that the reads were aligned to? How do you know that there were no reads mapped there?
  3. SomaticSniper's false positive rate increases with coverage and thus the mitochondrial chromosome typically has a high number of false positives.
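
One way to check point 2 is to compare the sequence names and lengths in the BAM header (the `@SQ` lines printed by `samtools view -H`) with the `.fai` index that `samtools faidx` writes for the reference FASTA. The following is only a minimal sketch, assuming both have been saved as plain text; the function names are made up for illustration and are not part of SomaticSniper or samtools.

```python
def parse_bam_header_contigs(header_text):
    """Return {name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each tag after "@SQ" looks like "SN:chr1" or "LN:249250621".
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        if "SN" in fields and "LN" in fields:
            contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai_contigs(fai_text):
    """Return {name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        cols = line.split("\t")
        contigs[cols[0]] = int(cols[1])
    return contigs


def reference_mismatches(bam_contigs, ref_contigs):
    """List BAM contigs that are absent from the reference or differ in length."""
    problems = []
    for name, length in bam_contigs.items():
        if name not in ref_contigs:
            problems.append(f"{name}: not in reference")
        elif ref_contigs[name] != length:
            problems.append(
                f"{name}: BAM length {length}, reference length {ref_contigs[name]}"
            )
    return problems
```

An empty list from `reference_mismatches` only shows that names and lengths agree; it does not prove the sequences are identical, but any entry in the list is a strong hint that the BAM was aligned to a different reference build.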

I reran the SomaticSniper SNP calling. The results are much better; however, for SNPs with high read coverage the false positive rate is really high. Any suggestions for that? Thank you for your reply.


Great! I'm glad that re-running has helped significantly.

We do not have a great solution for high coverage SNVs. The problems there are fundamental to the algorithm, and I won't go into the details. Your best bet is to filter out high coverage calls and/or utilize a different caller for those high depth regions. VarScan 2 does a nice job for us in general, but I'm certain there are other callers that would work well.
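
Filtering out high coverage calls could look roughly like the sketch below. It assumes the calls are in VCF format with a `DP` entry in each sample's FORMAT column; the function names and the `max_depth` parameter are hypothetical, and the cutoff itself has to be chosen from the depth distribution of your own data.

```python
def max_sample_depth(vcf_line):
    """Return the largest per-sample DP value on one VCF data line, or None."""
    cols = vcf_line.rstrip("\n").split("\t")
    if len(cols) < 10:
        return None  # no FORMAT/sample columns on this line
    keys = cols[8].split(":")
    if "DP" not in keys:
        return None
    dp_index = keys.index("DP")
    depths = []
    for sample in cols[9:]:
        values = sample.split(":")
        if dp_index < len(values) and values[dp_index].isdigit():
            depths.append(int(values[dp_index]))
    return max(depths) if depths else None


def drop_high_depth(vcf_lines, max_depth):
    """Yield header lines unchanged and data lines with depth <= max_depth.

    Lines without a usable DP value are kept rather than silently dropped.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        depth = max_sample_depth(line)
        if depth is None or depth <= max_depth:
            yield line
```

Because `drop_high_depth` is a generator over lines, it can be fed an open file handle and its output written straight to a new file. It uses the larger of the tumor and normal depths, so a site is dropped if either sample exceeds the cutoff.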

11.8 years ago
jiagehao ▴ 10

OK, thank you so much. SomaticSniper does work well on one pair of normal and cancer tissues. Is it possible to call SNPs in several pairs of tissues at once? I know that many pairs of samples are preferred for GATK SNP calling; how about SomaticSniper?


No, this is not the case with SomaticSniper. It will only handle one tumor/normal pair at a time.

11.8 years ago
jiagehao ▴ 10

Hi, I have used SomaticSniper to call cancer-specific SNPs in a pair of cancer and normal tissues; about 183436 SNPs differ between the cancer and normal samples. I would like to select high-quality SNPs with a somatic score above 200. However, when I checked the selected SNPs in IGV, a large number of them did not agree with what IGV shows. My question is whether the somatic score is a reliable and accurate parameter for selecting SNPs for further research. Moreover, as you have already told me, high-coverage SNPs have a higher false positive rate. If I want to filter them out, what coverage cutoff do you suggest? Above 50, or higher? Thank you so much for your reply.
