Hi,
I used CNVkit 0.9.7 to call gCNVs in a WES sample. Both the normals (9 samples) and the target sample were sequenced using target enrichment with the Agilent_SureSelect_v7_post capture kit. I ran the command below.
./cnvkit-0.9.7/cnvkit.py batch *_recal.sample.bam -n *recal.normal.bam --targets Agilent_SureSelect_v7_post_GRCh38-Edited.bed --fasta Homo_sapiens_assembly38.fasta --output-reference reference_1.cnn -d CNVkit_testing/As_Hybrid/ --rscript-path /usr/bin/Rscript -p 8
I have a few things to clarify.
1) I get the warning "Most antitarget bins (95.43%, 37050/38823) have low or no coverage; is this amplicon/WGS?". Since this is target enrichment, I assumed it falls under hybrid capture. Is my understanding wrong? Would it be correct to use -m amplicon for these samples?
2) I expected the batch command to produce a .cns file for my sample, and that I would then run the call command on it to get a call.cns with absolute copy numbers. However, batch produces a call.cns in addition to the .cns file, and the output says it was filtered by 'ci' and called with the default thresholds. My understanding was that I have to run call on the .cns from batch myself, using --filter ci with the confidence intervals from segmetrics and thresholds suited to gCNV calling. Why is this already happening in the batch command?
Part of the output is shown below.
Post-processing CNVkit_testing/As_Hybrid/SC605_recal.normal.cns ...
Wrote CNVkit_testing/As_Hybrid/SC605_recal.normal.cns with 42 regions
Applying filter 'ci'
Filtered by 'ci' from 42 to 28 rows
Calling copy number with thresholds: -1.1 => 0, -0.25 => 1, 0.2 => 2, 0.7 => 3
Wrote CNVkit_testing/As_Hybrid/SC605_recal.normal.call.cns with 28 regions
Ignoring 38823 off-target bins
Significant hits in 1230/228648 bins (0.538%)
Wrote CNVkit_testing/As_Hybrid/SC605_recal.normal.bintest.cns with 1230 regions
Time: 695.461 seconds (27021 reads/sec, 355 bins/sec)
Summary: #bins=246559, #reads=18792233, mean=76.2180, min=0.0, max=2894.4834437086092
Percent reads in regions: 52.847 (of 35559976 mapped)
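In case it helps to pin down what I am asking, here is how I read the "Calling copy number with thresholds" line, as a minimal Python sketch. This is only my interpretation and not CNVkit's actual code: the function name call_copy_number is made up, the thresholds are copied from the log, and the handling of values above the last threshold (rounding up the copy number implied by a diploid reference) is an assumption on my part.

```python
import math

# Upper log2 bounds copied from the log line:
# log2 <= -1.1 => 0 copies, <= -0.25 => 1, <= 0.2 => 2, <= 0.7 => 3
THRESHOLDS = (-1.1, -0.25, 0.2, 0.7)


def call_copy_number(log2_ratio, thresholds=THRESHOLDS, ploidy=2):
    """Hypothetical helper: map a segment log2 ratio to an integer copy number."""
    # The first threshold that the log2 ratio does not exceed gives the call.
    for copies, upper_bound in enumerate(thresholds):
        if log2_ratio <= upper_bound:
            return copies
    # Assumption: above the last threshold, round up the copy number
    # implied by the log2 ratio relative to a diploid reference.
    return int(math.ceil(ploidy * 2 ** log2_ratio))


if __name__ == "__main__":
    for log2_ratio in (-1.5, -0.5, 0.0, 0.5, 1.2):
        print(log2_ratio, call_copy_number(log2_ratio))
```

If this reading is right, for germline calling I would want to pass my own thresholds to the call step rather than rely on the defaults shown in the log.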
I would appreciate it if someone could clarify this.
Thank you.
Best,
Sumudu