I'm using CNVkit to detect copy number changes in samples obtained from amplicon sequencing (the sequenced region is around 330 kb). I have 8 normal samples and 20 test samples. I can detect the copy number regions in the test samples, but how do I find the copy number changes in the normal samples?
I created a reference CNN file from the normals and the reference genome (hg19). The generated CNN file shows chr, start, end, and log2, among other things. Can I use the log2 value to determine the copy number regions in the normal samples, or is there another way to get that information?
Any help appreciated!
Thanks for the info! That is sort of what I needed. Is there any way to know the CNVs in individual controls?
Yes, you can run the rest of the pipeline with the control samples' targetcoverage.cnn and antitargetcoverage.cnn files versus the existing reference.cnn file, as if the controls were additional test samples. The rest of the pipeline is the same after you've run the fix command.
Or, if you don't mind some recomputation, run
cnvkit.py batch *_Normal.bam -r cnv-reference.cnn
to reprocess the control samples from scratch using the reference you previously built.
In general, CNVkit's segmented calls are larger than the common germline CNVs you see in healthy cells, so you shouldn't see much happening in the control samples.
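If you go the first route and reuse the existing coverage files, the per-control steps could be scripted roughly as below. This is only a sketch: the sample names (N1_Normal, N2_Normal), the helper control_commands, and the output file names are assumptions for illustration, and it presumes the coverage files follow the usual Sample.targetcoverage.cnn / Sample.antitargetcoverage.cnn naming. It only builds and prints the fix, segment and call commands; it does not run them.

```python
from pathlib import Path


def control_commands(sample, reference="cnv-reference.cnn", workdir="."):
    """Build the CNVkit commands to process one control sample
    against an existing reference, as if it were a test sample.

    `sample` and the derived file names are hypothetical; adjust them
    to match the files your earlier run actually produced.
    """
    d = Path(workdir)
    target = d / f"{sample}.targetcoverage.cnn"
    antitarget = d / f"{sample}.antitargetcoverage.cnn"
    cnr = d / f"{sample}.cnr"        # bin-level log2 ratios from `fix`
    cns = d / f"{sample}.cns"        # segments from `segment`
    called = d / f"{sample}.call.cns"  # copy number calls from `call`
    return [
        ["cnvkit.py", "fix", str(target), str(antitarget), reference,
         "-o", str(cnr)],
        ["cnvkit.py", "segment", str(cnr), "-o", str(cns)],
        ["cnvkit.py", "call", str(cns), "-o", str(called)],
    ]


if __name__ == "__main__":
    # Hypothetical control sample names.
    for sample in ["N1_Normal", "N2_Normal"]:
        for cmd in control_commands(sample):
            print(" ".join(cmd))
```

To actually execute the steps rather than print them, each command list could be passed to subprocess.run(cmd, check=True) in the same loop.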