Hi, everyone,
I just got CNVkit results from multiple samples separately, but it seems that there's no function in CNVkit to merge CNV results from multiple samples. Or is there a command I haven't noticed?
Thanks for your interest.
What type of result do you want from merging samples?
One option is the heatmap command with all of your .cns files.
Another is the export seg command, which creates a SEG file from each .cns file of interest; you can then use the output SEG files with a recent version of GISTIC2. (The "markers" file is no longer needed in recent versions of GISTIC, I'm told. Just the SEG files should be enough. I haven't tried this myself.)
If you build a reference from control samples, the control samples should be prepared and sequenced with the same protocol as the test samples, so NOT WGS if the test samples were sequenced with a target panel. If you do have process-matched controls, then a pooled reference built from those controls is usually better than a control-free reference. Otherwise, just use a control-free reference.
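To make the export seg step mentioned above more concrete: on the command line it is typically something like "cnvkit.py export seg *.cns -o samples.seg", and the heatmap is something like "cnvkit.py heatmap *.cns -o heatmap.pdf" (the output filenames here are placeholders). The sketch below is not CNVkit's own code. It is a minimal stdlib-only illustration of what a merged SEG table looks like, assuming each .cns file is tab-separated with at least the columns chromosome, start, end, probes and log2, and assuming the sample ID can be taken from the filename prefix. The function names are hypothetical, and coordinates are copied unchanged, which may differ from what export seg actually writes.

```python
import csv
from pathlib import Path

# Column names of the SEG format as read by tools such as GISTIC2.
SEG_HEADER = ["ID", "chrom", "loc.start", "loc.end", "num.mark", "seg.mean"]


def cns_to_seg_rows(sample_id, cns_lines):
    """Yield one SEG row per segment from the text lines of a .cns file."""
    for row in csv.DictReader(cns_lines, delimiter="\t"):
        yield [sample_id, row["chromosome"], row["start"], row["end"],
               row["probes"], row["log2"]]


def merge_cns_files(cns_paths, out_path):
    """Write one combined SEG table from several .cns files.

    Returns the number of segment rows written (header not counted).
    """
    n_rows = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(SEG_HEADER)
        for path in map(Path, cns_paths):
            # Assumption: "caseA.cns" or "caseA.call.cns" belongs to sample "caseA".
            sample_id = path.name.split(".")[0]
            with open(path, newline="") as handle:
                for seg_row in cns_to_seg_rows(sample_id, handle):
                    writer.writerow(seg_row)
                    n_rows += 1
    return n_rows
```

Calling merge_cns_files(["caseA.cns", "caseB.cns"], "merged.seg") would write a single table with one row per segment per sample. For real analyses the built-in export seg command is the safer choice; the sketch is only meant to show the shape of the merged output.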
Don't worry too much about the number of segments; instead, use the segmetrics and call commands to do further filtering if you need it. You can also repeat segmentation with a more stringent p-value threshold (-t) to reduce the number of segment breakpoints in the .cns files.
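As a rough illustration of what "further filtering" of segments can mean, here is a small stand-in post-filter written with the Python standard library. It is not what segmetrics or call do internally. It simply keeps segments whose log2 ratio is far enough from zero and which are supported by enough probes, assuming a tab-separated .cns table with log2 and probes columns. The function name and both cutoffs are hypothetical values chosen for the example, not recommendations.

```python
import csv


def filter_segments(cns_lines, min_abs_log2=0.2, min_probes=5):
    """Return the segments (as dicts) that pass both illustrative cutoffs.

    cns_lines: iterable of text lines from a .cns file, header included.
    min_abs_log2: hypothetical minimum |log2 ratio| to keep a segment.
    min_probes: hypothetical minimum number of supporting probes (bins).
    """
    kept = []
    for row in csv.DictReader(cns_lines, delimiter="\t"):
        log2 = float(row["log2"])
        probes = int(row["probes"])
        if abs(log2) >= min_abs_log2 and probes >= min_probes:
            kept.append(row)
    return kept
```

With an open file handle this would be used as filter_segments(open("caseA.cns")). Segments near log2 = 0 or backed by only a few probes are dropped, which shortens the list without re-running segmentation.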
Another question about controls.
I have only 2 controls sequenced with the same strategy as the 20 cases. Given the limited number of controls, I tried running CNVkit with two strategies, which gave obviously different outputs.
In this case, what can I do to get more accurate results? Or can I use a reference constructed with no control samples, then call CNVs for all cases and the 2 controls, and finally compare the results of cases and controls?
My data are from targeted sequencing of an 8M region (hybrid capture). If I use WGS data of normal controls from another paper to construct the pooled reference, is that acceptable?
Thanks in advance