How should I decide which variant caller is best for a cancer mutation dataset? The average tumor purity is about 70%, which is not great.
MuSE performs very well on a similar dataset, but that one has 90%+ purity, so I'm not sure how it will perform on this data. It seems MuSE generally outperforms MuTect2, but I'm still unsure...
It seems that tumor purity confounds the results, so I'm leaning towards VarScan, which seems to sidestep this because it does not use a probabilistic framework (such as Bayesian statistics) to detect variants and assess confidence in them. However, it struggles with sensitivity and fails to pick up somatic SNVs of low allelic fraction, which is a major problem.
I would really appreciate some advice on what to look at when deciding what to use...
Do you have a reference for this statement? (I'm genuinely interested; I don't mean to argue for or against it.)
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1029-6
This is a good comparison. Maybe I'm overgeneralising, but in the context I'm working with this seems to be the case.
Of course, that is a paper from the MuSE developers. Every variant caller that gets published claims to be better than all the previous ones.
It's pretty easy to outperform 5 other callers when you get to select the data set, the truth set, and the callers to compare against.
In fact I was hoping for a reference other than the authors' paper...
Good point. I will get back to you if I find anything worthwhile.
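For what it's worth, one way to avoid relying on any single paper's benchmark is to score each caller's output against a truth set you trust for your own data (for example, a validated subset of variants). Below is a minimal Python sketch of that comparison. It assumes the calls have already been parsed into (chrom, pos, ref, alt) tuples; the caller names, the toy variants, and the `score_calls` helper are all hypothetical and only illustrate the bookkeeping.

```python
from typing import Dict, Set, Tuple

# A variant is identified by (chrom, pos, ref, alt); this keying is an assumption.
Variant = Tuple[str, int, str, str]


def score_calls(calls: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare one caller's call set against a truth set (hypothetical helper)."""
    tp = len(calls & truth)   # called and in the truth set
    fp = len(calls - truth)   # called but not in the truth set
    fn = len(truth - calls)   # in the truth set but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy data, invented purely for illustration.
truth: Set[Variant] = {
    ("chr1", 100, "A", "T"),
    ("chr1", 200, "G", "C"),
    ("chr2", 300, "C", "T"),
    ("chr3", 400, "T", "G"),
}
calls_by_caller: Dict[str, Set[Variant]] = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
                 ("chr2", 300, "C", "T"), ("chr5", 999, "A", "G")},
    "caller_b": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C")},
}

results = {name: score_calls(calls, truth)
           for name, calls in calls_by_caller.items()}

for name, r in results.items():
    print(f"{name}: precision={r['precision']:.2f} "
          f"recall={r['recall']:.2f} f1={r['f1']:.2f}")
```

In this toy example, `caller_b` makes no false positive calls but misses half of the truth set, which is the kind of sensitivity trade-off raised in the question. Running the same scoring separately on low-allelic-fraction variants would show whether a caller's sensitivity drops there.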