I benchmarked several somatic variant callers (mostly VarScan and MuTect) on the same dataset, and VarScan always seems to call more variants than the others. My impression is that VarScan is a little loose and MuTect is more stringent. Am I right? Is there a specific part of their algorithms that causes this difference (the algorithm itself, or particular cutoffs)? Thanks
Did you run the entire VarScan2 pipeline, including the processSomatic and fpfilter steps to remove junk calls based on the heuristic filters?

Yes, I did that. VarScan still always gets more variants than the other tools.
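Just to make sure we mean the same thing, the full workflow is roughly the following (file names and the threshold values shown are only placeholders; double-check the exact flags against the VarScan2 documentation):

    # 1) call somatic variants from the normal/tumor pileups
    java -jar VarScan.jar somatic normal.pileup tumor.pileup out --min-var-freq 0.10 --somatic-p-value 0.05 --output-vcf 1
    # 2) split the calls into Germline/Somatic/LOH and extract the high-confidence somatic set
    java -jar VarScan.jar processSomatic out.snp.vcf
    # 3) remove likely false positives using per-site read counts from the tumor BAM
    java -jar VarScan.jar fpfilter out.snp.Somatic.hc.vcf tumor.readcounts --output-file out.snp.Somatic.hc.fpfilter.vcf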
Wait, in your title, you imply that it gets more variant calls - is it more or less?
In any case, finding agreement between different NGS variant callers is known to be difficult - it just does not happen, even when trying to control for similar parameters. They should all be benchmarked not against each other but against the gold standard in genetic variant calling, which is still Sanger sequencing. Being an inferior methodology, NGS will always struggle to match the precision of Sanger sequencing.
Sorry. This is a typo. I mean more.
I can accept that different tools call different variants; after all, authors claim their method is "improved" over existing tools in order to get published. My concern is that VarScan consistently calling more variants than the other tools implies loose criteria and a high false-positive rate.
Well... I did not use fpfilter. Is that a new filtering method? This is the first time I have heard of it.
No, it is described in the VarScan2 paper, on SourceForge, and on GitHub, and it pops up as a selectable option when running plain java -jar VarScan2.jar (from release 2.3.8 on, I think).
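In case it helps: fpfilter needs a second input with per-site read counts from the tumor BAM, which is usually generated with bam-readcount. A rough sketch, with placeholder file names and flags that should be verified against the bam-readcount documentation:

    # count reads in the tumor BAM at the candidate somatic positions
    # (somatic_sites.txt is a placeholder list of regions taken from the VarScan output)
    bam-readcount -q 1 -b 20 -f reference.fa -l somatic_sites.txt tumor.bam > tumor.readcounts
    # the resulting tumor.readcounts file is the second argument to VarScan fpfilter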