Hello everyone,
I am trying to call rare variants in pools of four tumor samples for a specific gene of interest. To do so, we performed amplicon next-generation sequencing on our pools. All of the tumors carry a monosomy of the chromosome that harbors our gene of interest.
I aligned the fastq files to the hg19 assembly using bwa mem and removed primers with BAMClipper (Au CH., 2017). I then took the union of the calls from three variant callers (Mutect2, HaplotypeCaller and FreeBayes), each run in single-sample mode. All potentially interesting variants are then validated biologically by Sanger sequencing.
The tools that I use take the ploidy of the pool as a parameter for variant calling (e.g. for a pool of 8 germline DNA samples, the ploidy should be set to 16, i.e. 8 x 2).
My question is the following: how should I adjust the ploidy parameter for pools of four monosomic tumors?
My guess is that I should perhaps increase the ploidy, setting it to 8 instead of 4 for instance, because of the possible tumor contamination.
Does anyone have thoughts on, or experience with, cases like this?
Best regards
Alexandre
This applies to both Mutect2 and HaplotypeCaller (I have no experience with FreeBayes).
So the DNA was pooled, then libraries were made and sequenced? Are the four tumours all from the same individual?
Is the amplicon panel only on the monosomic chromosome? Or are you restricting variant calling to the interval of the gene of interest (-L flag for M2, HC)? If so, I would start with ploidy = 1 per sample and apply the formula above (number of samples x ploidy per sample), which gives 4 for a pool of four tumours. Otherwise, use ploidy = 2 per sample.
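
To make the arithmetic explicit, here is a minimal Python sketch of the pool-ploidy formula discussed above. The function name `pool_ploidy` is just an illustrative helper, not part of any caller. The result is the value you would pass to the caller's ploidy option (as far as I know, `--sample-ploidy` in GATK4 HaplotypeCaller and `--ploidy` in FreeBayes; please check the documentation for your versions).

```python
def pool_ploidy(n_samples: int, copies_per_sample: int) -> int:
    """Total chromosome copies in a pool: samples x copies per sample.

    Hypothetical helper for illustration only.
    """
    if n_samples < 1 or copies_per_sample < 1:
        raise ValueError("both values must be positive integers")
    return n_samples * copies_per_sample


# Example from the question: 8 diploid germline samples
germline_pool = pool_ploidy(8, 2)

# Four tumours, calling only on the monosomic chromosome (1 copy each)
monosomic_pool = pool_ploidy(4, 1)

# Four tumours, calling on regions that are still diploid (2 copies each)
diploid_pool = pool_ploidy(4, 2)

print(germline_pool, monosomic_pool, diploid_pool)
```

This only covers the copy-number bookkeeping; it does not model contamination by normal cells, which is a separate question.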