4.8 years ago
vctrm67
I am a bit confused as to why CNV callers require Mutect output for CNV calling. Callers such as CNVkit specify that it has something to do with allelic imbalance but I'm not completely sure. Could someone elaborate?
You need MuTect output to call regions of LOH. Further information is in the post "What does BAF mean?"
Ah. Would this be the raw, unfiltered output of Mutect? I.e., Mutect labels calls as passing its filters, germline risk, found in panels, etc. Which set of SNVs is required for CNV calling?
It depends on the tool you want to use. The standard choice is SNVs that are heterozygous in the germline.
So for CNVkit, I see this message:
I assume by "b-allele frequencies (BAFs) of the heterozygous, non-somatic SNVs" they require germline SNVs? Why not use GATK's HaplotypeCaller on a normal sample and use the output from that then?
Yes, you got it right :) And yes, you can use GATK's HaplotypeCaller to call variants from the normal sample. But what you are actually interested in is how the BAFs of germline variants changed in the tumor, so you need to "call" these variants in the tumor and compare them with what you saw in the normal.
https://cnvkit.readthedocs.io/en/stable/baf.html
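To make the normal-versus-tumor comparison concrete, here is a minimal sketch in Python. It is not CNVkit's implementation: the function names, the 0.3 to 0.7 heterozygosity window, and the read counts are all assumptions chosen for illustration.

```python
def baf(ref_reads, alt_reads):
    """B-allele frequency: fraction of reads supporting the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no coverage at this site")
    return alt_reads / total


def is_germline_het(normal_baf, lo=0.3, hi=0.7):
    """Treat a site as heterozygous in the normal if its BAF is near 0.5.

    The 0.3-0.7 window is an illustrative assumption, not a CNVkit default.
    """
    return lo <= normal_baf <= hi


def tumor_baf_shift(normal_counts, tumor_counts):
    """Return how far the tumor BAF moved away from 0.5 at a germline het site.

    Each argument is a (ref_reads, alt_reads) tuple. Returns None when the
    site is not heterozygous in the normal, since such sites carry no
    information about allelic imbalance.
    """
    if not is_germline_het(baf(*normal_counts)):
        return None
    return abs(baf(*tumor_counts) - 0.5)


# Hypothetical read counts: balanced in the normal, skewed in the tumor.
shift = tumor_baf_shift(normal_counts=(48, 52), tumor_counts=(10, 90))
print(f"tumor BAF shift from 0.5: {shift:.2f}")
```

A shift near 0 suggests both alleles are still present in equal amounts, while a large shift suggests allelic imbalance or LOH in that region (tumor purity will dampen the observed shift).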
Yes, the unfiltered ones. CNVkit will filter out variants with the somatic tag in Mutect2 calls.
I just ran CNVkit with and without a VCF input, and the calls look the same apart from a few extra columns of information in the calls made with the VCF. Is there a reason? I don't know too much about it, but I would have thought that the VCF itself would change some of the calls.
Basically, you are looking for regions of LOH and allelic imbalance using the SNP information, and that is what the extra columns report. Where the values in the cn1 and cn2 columns differ, the segment is a region of LOH or allelic imbalance. So there are three types of LOH: copy number gain LOH, copy number loss LOH, and copy number neutral LOH.
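As a rough illustration of how the three LOH types follow from the allele-specific columns, here is a small Python sketch. The `cn1` and `cn2` names come from the output mentioned above; the total `cn` argument, the diploid baseline, and the label strings are assumptions, and this is not CNVkit code.

```python
def classify_segment(cn, cn1, cn2, ploidy=2):
    """Label a segment from its total (cn) and allele-specific (cn1, cn2) copy numbers.

    Assumes cn == cn1 + cn2 and a diploid baseline unless ploidy is given.
    """
    if min(cn1, cn2) > 0:
        # Both alleles are still present, so this is not LOH.
        return "balanced" if cn1 == cn2 else "allelic imbalance"
    if cn == 0:
        # Neither allele remains.
        return "homozygous deletion"
    # Exactly one allele remains: LOH, subdivided by total copy number.
    if cn > ploidy:
        return "copy number gain LOH"
    if cn < ploidy:
        return "copy number loss LOH"
    return "copy number neutral LOH"


# Hypothetical segments as (cn, cn1, cn2) tuples.
for seg in [(2, 1, 1), (3, 2, 1), (3, 3, 0), (1, 1, 0), (2, 2, 0)]:
    print(seg, "->", classify_segment(*seg))
```

Note that a copy number neutral LOH segment has the same total copy number as a normal diploid segment, which is consistent with the total calls looking the same with and without the VCF.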