Hi all,
I have a question regarding CNV analysis (new CNVkit user). I am analyzing tumor samples from whole exome sequencing using CNVkit (v 0.9.3). I followed the workflow from the CNV tutorial:
- First, I run the 'batch' command on all my normal samples to generate a pooled reference (n = 50 normal samples).
- Next, using the pooled reference, I call copy number from my tumor samples (n > 700 tumor samples).
- Then, I run the 'metrics' command to evaluate the quality of the samples, inspect the coverages and remove tumor samples with an extremely high number of segments (all looked ok).
Note, I am using a pooled normal as a reference, since my samples don't have a matched normal control.
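For readers who want to follow along, the three steps above can be sketched roughly as below. This only builds and prints the command lines, it does not run CNVkit. All file names (BAMs, BED, FASTA, output directory) are hypothetical placeholders, and the assumption that 'batch' writes one .cnr and one .cns per tumor BAM into the output directory should be checked against your own run.

```python
import shlex

# Hypothetical file names, for illustration only.
NORMAL_BAMS = ["normal_01.bam", "normal_02.bam"]
TUMOR_BAMS = ["tumor_001.bam", "tumor_002.bam"]


def reference_cmd(normal_bams, targets_bed, fasta, reference_out):
    """Step 1: build a pooled reference from the normal samples only."""
    return ["cnvkit.py", "batch", "--normal", *normal_bams,
            "--targets", targets_bed, "--fasta", fasta,
            "--output-reference", reference_out]


def tumor_cmd(tumor_bams, reference, out_dir):
    """Step 2: process the tumor samples against the pooled reference."""
    return ["cnvkit.py", "batch", *tumor_bams,
            "--reference", reference, "--output-dir", out_dir, "--scatter"]


def metrics_cmd(cnr_files, cns_files):
    """Step 3: per-sample noise and segmentation metrics."""
    return ["cnvkit.py", "metrics", *cnr_files, "--segments", *cns_files]


# Assumed output naming: <sample>.cnr and <sample>.cns inside the output dir.
cnr = [f"results/{b[:-4]}.cnr" for b in TUMOR_BAMS]
cns = [f"results/{b[:-4]}.cns" for b in TUMOR_BAMS]

commands = [
    reference_cmd(NORMAL_BAMS, "targets.bed", "genome.fa", "pooled_normals.cnn"),
    tumor_cmd(TUMOR_BAMS, "pooled_normals.cnn", "results"),
    metrics_cmd(cnr, cns),
]

for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them before running anything on 700+ samples.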
I was able to generate the scatter plot from the 'call' command output using the default parameters (attached). As you can see, the scatter plot is very noisy. Also, for your reference, I tried increasing the bin size in the 'batch' command to see if this would help reduce the noise, but it made no difference to the noise level.
My question is: do you have any suggestions for dealing with noisy samples such as this one?
Many thanks in advance for your help!
Regards,
Fil

![Scatter plot from 'call' output (default parameters)][1]
Dear Eric,
Have you had a chance to check the performance of CNVkit with pooled tumor samples? We have targeted, hybrid-capture sequencing data and no normal samples. I'm wondering which of the following setups should, in principle, be preferred:
1) Calling CNVs with no control samples (a flat reference).
2) Calling CNVs with a pool of tumor samples, prepared with the same library preparation method and sequenced together, used as the control.
I understand the advantage of using a "panel of tumors": in particular, it helps to deal with the large variation in depth between targets and to reduce batch effects. However, the main drawback is that we would miss real CNAs that are shared across the tumor samples, since they would also be present in the reference.
Many thanks!
Probably 2, unless there are highly recurrent CNAs in your cohort. I would try it both ways and compare the results.
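
To illustrate why recurrence matters here, below is a toy sketch with made-up numbers (not real data). It is not CNVkit's actual algorithm: the pooled reference is simplified to a per-bin median depth across the tumors, and the flat reference to a single constant depth with no bias correction. The names `NEUTRAL_DEPTH` and `COPY_NUMBER` are hypothetical.

```python
import math
from statistics import median

# Toy numbers: expected depth of each bin at 2 copies.
# Bins differ because capture efficiency differs between targets.
NEUTRAL_DEPTH = [100, 40, 250, 100]

# Copy number per bin for 5 hypothetical tumors.
# Bin 0 is gained in every tumor (highly recurrent);
# bin 1 is gained only in tumor 0 (rare).
COPY_NUMBER = [
    [4, 4, 2, 2],
    [4, 2, 2, 2],
    [4, 2, 2, 2],
    [4, 2, 2, 2],
    [4, 2, 2, 2],
]


def depths(copy_number):
    """Depth per bin, scaled linearly with copy number."""
    return [d * cn / 2 for d, cn in zip(NEUTRAL_DEPTH, copy_number)]


def log2_ratios(sample, reference):
    """Per-bin log2 ratio of sample depth to reference depth."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]


tumor_depths = [depths(cn) for cn in COPY_NUMBER]

# Simplified pooled tumor reference: per-bin median across all tumors.
pooled_reference = [median(col) for col in zip(*tumor_depths)]
# Simplified flat reference: the same depth assumed for every bin.
flat_reference = [100.0] * len(NEUTRAL_DEPTH)

pooled = log2_ratios(tumor_depths[0], pooled_reference)
flat = log2_ratios(tumor_depths[0], flat_reference)

print("pooled tumor reference:", [round(x, 2) for x in pooled])
print("flat reference:        ", [round(x, 2) for x in flat])
```

In this toy case the pooled tumor reference removes the per-target depth variation and keeps the rare gain in bin 1, but the gain in bin 0 that every tumor shares disappears. The flat reference keeps the shared gain, but the per-target depth differences remain in the signal: the real gain in bin 1 shows up as an apparent loss and the neutral bin 2 as an apparent gain. That is the trade-off to look for when comparing the two setups.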