7.4 years ago
Zhenyu Zhang
I am looking for open-source tools that can infer somatic mutations from a tumor sample without a paired normal sample.
I understand you can use tools like MuTect to do single-sample calling, coupled with some filtering of common variants. But I am looking for more complicated tools, such as the one FoundationMedicine is using. Do you have any suggestions?
Also, you asked this exact same question previously. Repeating questions verbatim is looked down upon.
Hmm, I totally do not remember that post from a year ago.
You're going to have a hard time determining somatic variants without a paired normal sample. The best you're going to get is to remove common germline variants found in dbSNP, DGV, and the like.
What tool is FoundationMedicine using and how is it any more complicated than any other variant caller? MuTect, GATK, and VarScan2 are all good variant callers - none of them are going to be able to confirm a somatic mutation from a single tumor sample.
Actually, tumor-only does not mean there are no normals. I may have overstated what FM is doing, but what I really mean is that you can take advantage of tumor impurity and build probability models of whether a variant is somatic or germline, based on copy number variation, subclonality, and allele frequency.
In the simplest example, for a sample with 20% tumor purity, a variant with AF close to 10% in a diploid region is likely to be somatic, while a variant with AF close to 50% is likely to be germline.
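To make the arithmetic behind that example explicit, here is a minimal sketch of the usual mixture model for expected allele fraction. The function name and parameters are my own for illustration and are not taken from any particular tool; it assumes the normal cells are diploid.

```python
def expected_allele_fraction(purity, tumor_copies, mutated_copies, germline_copies=0):
    """Expected variant allele fraction in a mix of tumor and normal cells.

    purity          -- fraction of cells that are tumor (0..1)
    tumor_copies    -- total copy number of the locus in tumor cells
    mutated_copies  -- copies carrying the variant in tumor cells
    germline_copies -- copies carrying the variant in normal cells
                       (0 = somatic, 1 = heterozygous germline)
    Assumption: normal cells are diploid at this locus.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    variant_alleles = purity * mutated_copies + (1.0 - purity) * germline_copies
    total_alleles = purity * tumor_copies + (1.0 - purity) * 2
    return variant_alleles / total_alleles


# 20% purity, diploid region, one mutated copy in the tumor cells
somatic_af = expected_allele_fraction(0.2, tumor_copies=2, mutated_copies=1)
germline_af = expected_allele_fraction(0.2, tumor_copies=2, mutated_copies=1,
                                       germline_copies=1)
print(round(somatic_af, 3), round(germline_af, 3))  # 0.1 0.5
```

The gap between the two expected values shrinks as purity approaches 100%, which is why impure samples are the easier case for this kind of classification.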
That is exactly what is implemented in PureCN: https://doi.org/10.1186/s13029-016-0060-z (https://bioconductor.org/packages/devel/bioc/html/PureCN.html). For targeted panels, this will only work if you have a sufficiently large pool of normals, even coverage, and either hybrid capture data (which gives you off-target reads) or copy number tiling probes.
Looks great, I will have a look. I assume the panel of normals is needed to call CNVs from targeted panels or WXS.
Please define "complicated". What is FM using?
I'm not sure what you would do other than look for mutations that are not currently in dbSNP at a significant rate in the general population and are below a 50% allele frequency in the sample. A 50% or 100% AF variant might be somatic but is indistinguishable from germline based on AF. Junctions are probably more likely to be tumor-specific. But... if you really want to have any confidence, you need tumor/normal.
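As a rough sketch of that kind of filter (the field names and both thresholds are assumptions chosen for illustration, not recommended values):

```python
def is_candidate_somatic(variant, max_population_af=0.001, max_tumor_af=0.45):
    """Keep variants that are rare in the population and clearly below 50% AF.

    `variant` is a hypothetical dict with:
      population_af -- allele frequency in a population database (0 if absent)
      tumor_af      -- observed allele fraction in the tumor sample
    Both cutoffs are illustrative assumptions.
    """
    rare_in_population = variant["population_af"] < max_population_af
    below_heterozygous = variant["tumor_af"] < max_tumor_af
    return rare_in_population and below_heterozygous


variants = [
    {"id": "v1", "population_af": 0.0, "tumor_af": 0.12},   # rare, low AF
    {"id": "v2", "population_af": 0.15, "tumor_af": 0.48},  # common SNP
    {"id": "v3", "population_af": 0.0, "tumor_af": 0.51},   # rare, but ~50% AF
]
candidates = [v["id"] for v in variants if is_candidate_somatic(v)]
print(candidates)  # ['v1']
```

Note that v3 is dropped even though it could be somatic, which is exactly the ambiguity at 50% AF described above.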
Depends. Most clinical FFPE samples have low purity (typically well below 75%) and high coverage. Foundation Medicine's SGZ algorithm (cited in https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-017-0424-2) then adjusts allelic fractions for purity, ploidy, and local copy number.
Our PureCN is fairly similar and can, in addition, use a pool of normals to adjust for non-reference mapping biases. Usually you can classify 80-99% of variants with high confidence, depending on purity and chromosomal instability (CIN). Together with databases and the usual in silico pathogenicity prediction tools, that usually provides enough evidence for mutations in relevant genes.
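For intuition only, here is a toy version of the idea: compare how well the observed read counts fit the allele fraction expected for a somatic versus a heterozygous germline variant. This is not the SGZ or PureCN implementation. It assumes purity and copy number are already known, normal cells are diploid, read counts are binomial, and both hypotheses involve one variant copy in the tumor; all names are hypothetical.

```python
from math import exp, lgamma, log


def expected_af(purity, tumor_copies, mutated_copies, germline_copies):
    """Expected allele fraction in a tumor/normal mix (normal assumed diploid)."""
    variant = purity * mutated_copies + (1.0 - purity) * germline_copies
    total = purity * tumor_copies + (1.0 - purity) * 2
    return variant / total


def log_binom_pmf(k, n, p):
    """Log binomial probability of k variant reads out of n at rate p."""
    p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep log() finite at the extremes
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_coeff + k * log(p) + (n - k) * log(1.0 - p)


def posterior_somatic(alt_reads, depth, purity, tumor_copies=2,
                      mutated_copies=1, prior_somatic=0.5):
    """Posterior probability that a variant is somatic rather than germline het."""
    af_somatic = expected_af(purity, tumor_copies, mutated_copies, 0)
    af_germline = expected_af(purity, tumor_copies, mutated_copies, 1)
    log_s = log_binom_pmf(alt_reads, depth, af_somatic) + log(prior_somatic)
    log_g = log_binom_pmf(alt_reads, depth, af_germline) + log(1.0 - prior_somatic)
    top = max(log_s, log_g)  # shift before exponentiating to avoid underflow
    weight_s, weight_g = exp(log_s - top), exp(log_g - top)
    return weight_s / (weight_s + weight_g)


# 20% purity, diploid region, 500x coverage
p_low_af = posterior_somatic(alt_reads=50, depth=500, purity=0.2)    # AF 10%
p_high_af = posterior_somatic(alt_reads=250, depth=500, purity=0.2)  # AF 50%
print(p_low_af > 0.99, p_high_af < 0.01)  # True True
```

A real tool also has to estimate purity, ploidy, and local copy number, and consider several multiplicities per variant; this sketch fixes all of those by hand.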