We are considering a switch from Strelka1 to Strelka2, but our tests show poorer performance. Specifically, for the same data set, the specificity and sensitivity of Strelka 2.9.2 are significantly lower than what we were seeing with Strelka 1.0.14.
In both cases default options were used. I am now resetting the Strelka2 options to match Strelka1, but I am still getting specificity results 5-10% lower.
Has anybody encountered this too? Any suggestions?
In both cases configureStrelkaSomaticWorkflow.py is run with matched normal and tumour BAMs and the hg19a.fa reference. No other switches were applied. The Strelka2 run is also given --indelCandidates ./MANTA_ANALYSIS_PATH/results/variants/candidateSmallIndels.vcf.gz, which Strelka1 does not use (I believe).
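For anyone trying to reproduce the setup described above, here is a rough sketch of how the two Strelka2 command lines could be assembled. The BAM file names and the job count are placeholders I made up; only the reference name, the run directory and the Manta candidate path come from the post. The helper only builds the argument lists, it does not run anything.

```python
from pathlib import Path


def strelka2_commands(normal_bam, tumor_bam, reference, run_dir,
                      indel_candidates=None, jobs=8):
    """Build (configure, run) argument lists for a Strelka2 somatic run.

    Nothing is executed here; pass the lists to subprocess.run() if wanted.
    """
    configure = [
        "configureStrelkaSomaticWorkflow.py",
        "--normalBam", str(normal_bam),
        "--tumorBam", str(tumor_bam),
        "--referenceFasta", str(reference),
        "--runDir", str(run_dir),
    ]
    # The Manta candidate indels are optional, as in the post.
    if indel_candidates is not None:
        configure += ["--indelCandidates", str(indel_candidates)]
    # runWorkflow.py is written into the run directory by the configure step.
    run = [str(Path(run_dir) / "runWorkflow.py"), "-m", "local", "-j", str(jobs)]
    return configure, run


# Placeholder inputs: normal.bam and tumour.bam are hypothetical names.
configure_cmd, run_cmd = strelka2_commands(
    "normal.bam",
    "tumour.bam",
    "hg19a.fa",
    "STRELKA_ANALYSIS_PATH",
    indel_candidates="./MANTA_ANALYSIS_PATH/results/variants/candidateSmallIndels.vcf.gz",
)
print(" ".join(configure_cmd))
print(" ".join(run_cmd))
```

Building the same list with indel_candidates=None gives a run without the Manta input, which may be a useful control when checking whether the candidate set changes the SNV results at all.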
Then STRELKA_ANALYSIS_PATH/runWorkflow.py is run. The resulting SNV set, STRELKA_ANALYSIS_PATH/results/variants/somatic.snvs.vcf.gz, is intersected with a ground-truth set using bedtools to provide sensitivity and specificity numbers.
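To make the comparison step concrete, here is a minimal sketch of how a call set and a ground-truth set can be turned into counts and rates. It is not the bedtools procedure from the post: it matches sites on (chrom, pos, ref, alt) in plain Python, and the example sites are invented. Note that specificity in the strict sense, TN / (TN + FP), needs a count of true negatives, and the post does not say how those were defined. The sketch therefore reports precision, TP / (TP + FP), which is an assumption about what may have been measured.

```python
def site_key(vcf_line):
    """Return (chrom, pos, ref, alt) from a tab-separated VCF record."""
    chrom, pos, _id, ref, alt = vcf_line.rstrip("\n").split("\t")[:5]
    return (chrom, int(pos), ref, alt)


def confusion_counts(called, truth):
    """Count true positives, false positives and false negatives."""
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and in the truth set
    fp = len(called - truth)   # called but not in the truth set
    fn = len(truth - called)   # in the truth set but missed
    return tp, fp, fn


def sensitivity(tp, fn):
    return tp / (tp + fn) if (tp + fn) else float("nan")


def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Invented example records, not real data.
called = [site_key(l) for l in (
    "chr1\t100\t.\tA\tT",
    "chr1\t200\t.\tG\tC",
    "chr2\t300\t.\tC\tT",
)]
truth = [site_key(l) for l in (
    "chr1\t100\t.\tA\tT",
    "chr1\t200\t.\tG\tC",
    "chr3\t400\t.\tT\tG",
)]

tp, fp, fn = confusion_counts(called, truth)
print(tp, fp, fn, sensitivity(tp, fn), precision(tp, fp))
```

Whatever definition is used, it has to be applied identically to the Strelka1 and Strelka2 outputs for the two numbers to be comparable.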
Aside from minor differences in some parameter defaults, only the introduction of the Manta candidate set is different. But I am getting a drop of about 10% in specificity, from about 0.911 with Strelka1 to 0.8205 with Strelka2.
We have tried removing lower-QSS variants from the resulting Strelka2 set (somatic.snvs.vcf.gz) but never get a specificity better than about 0.87.
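The QSS filtering described above could look roughly like the sketch below. It assumes the score sits in the INFO column as a KEY=VALUE pair, as QSS does in Strelka somatic SNV output. The key name is a parameter so another score can be tried (as far as I know Strelka2 also writes a SomaticEVS value there, but treat that as an assumption). The records and the cut-off of 30 are invented for illustration.

```python
def info_value(vcf_line, key):
    """Return the numeric INFO value for `key`, or None if it is absent."""
    info = vcf_line.rstrip("\n").split("\t")[7]
    for field in info.split(";"):
        name, _, value = field.partition("=")
        if name == key and value:
            return float(value)
    return None


def filter_by_score(lines, key="QSS", minimum=0.0):
    """Yield header lines unchanged and records whose score is >= minimum."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        score = info_value(line, key)
        if score is not None and score >= minimum:
            yield line


# Invented records; a real file would be read with gzip.open(path, "rt").
records = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tT\t.\tPASS\tNT=ref;QSS=45;SGT=AA->AT",
    "chr1\t200\t.\tG\tC\t.\tPASS\tNT=ref;QSS=18;SGT=GG->CG",
]
kept = list(filter_by_score(records, key="QSS", minimum=30))
print(len(kept))
```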
The passing thresholds for somatic variants in Strelka2 are set to be more sensitive than the default pass settings in Strelka1, so a drop in specificity would not necessarily be surprising, depending on the sample. It would be concerning, however, if you found that results were still suboptimal after selecting a more stringent threshold for passing variants.
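One way to check this is to sweep the score threshold and compare the two callers at matched operating points rather than at their default pass settings. The sketch below is only an illustration: the site labels, scores and truth set are invented, and precision stands in for whatever "specificity" measure is in use.

```python
def sweep(scored_calls, truth, thresholds):
    """Return (threshold, sensitivity, precision) for each threshold.

    scored_calls: iterable of (site, score) pairs
    truth:        iterable of true sites
    """
    scored_calls = list(scored_calls)
    truth = set(truth)
    rows = []
    for t in thresholds:
        kept = {site for site, score in scored_calls if score >= t}
        tp = len(kept & truth)
        fp = len(kept - truth)
        sens = tp / len(truth) if truth else float("nan")
        prec = tp / (tp + fp) if kept else float("nan")
        rows.append((t, sens, prec))
    return rows


# Invented example: five calls with scores, four true sites.
scored = [("a", 50), ("b", 30), ("c", 10), ("x", 20), ("y", 5)]
truth = {"a", "b", "c", "d"}

for t, sens, prec in sweep(scored, truth, thresholds=[0, 15, 40]):
    print(f"threshold={t} sensitivity={sens:.2f} precision={prec:.2f}")
```

If Strelka2 gives lower sensitivity than Strelka1 at every threshold where the precision is equal, that would be the kind of result worth reporting with the sample details.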
In both our own analysis from the Strelka2 publication (Supplementary Figure 5) and independent analyses from the Lancet publication, Strelka2 generally outperforms the original Strelka model. This is in addition to numerous analyses supporting our newer methods that are more difficult to provide in a publication.
If you'd like support from the development team please report this to https://github.com/Illumina/strelka/issues with whatever details you can provide on your sample and validation data. If something about this sample is leading to poor results with the newer methods we'd certainly appreciate the chance to help patch things up.
How do you know what are true positives and negatives? Is this a reference dataset?
ANY suggestions?