I am trying out CNVkit and have run it on a test exome sequencing sample (non-cancer blood) against a panel of other non-cancer blood samples. The test sample contains a very large homozygous deletion that should be trivial to detect. However, the deletion is not called using cbs (with default parameters or with threshold 0.2) or flasso, and I get a warning like:
DtypeWarning: Columns (1) have mixed types. Specify dtype option on import or set low_memory=False.
  data = self._reader.read(nrows)
Haarseg does detect the deletion, but it calls 1255 segments, and the gene names are missing from its output.
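On the DtypeWarning: as far as I know, pandas numbers columns from 0 in that message, so "Columns (1)" is the second column of whatever table was being read. I have not confirmed which table that is. My guess, and it is only a guess, is a chromosome column that mixes numeric names (1, 2, ...) with X and Y. Below is a minimal stdlib-only sketch for checking which columns of a tab-separated table mix numeric-looking and non-numeric values. The function name `mixed_type_columns` and the toy table are made up for illustration and are not part of CNVkit.

```python
import csv
import io


def mixed_type_columns(text, delimiter="\t"):
    """Return names of columns that hold both numeric-looking and other values."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    seen = {}  # column name -> set of kinds observed ("num" or "str")
    for row in reader:
        for name, value in row.items():
            try:
                float(value)
                kind = "num"
            except (ValueError, TypeError):
                kind = "str"
            seen.setdefault(name, set()).add(kind)
    return [name for name, kinds in seen.items() if len(kinds) > 1]


# Toy table for illustration only; these are not values from my sample.
toy = (
    "chromosome\tstart\tend\tlog2\n"
    "1\t100\t200\t0.02\n"
    "2\t100\t200\t-0.01\n"
    "X\t100\t200\t-0.10\n"
)
print(mixed_type_columns(toy))
```

To try it on a real file, pass the file contents instead of the toy string, for example `mixed_type_columns(open("C1-25.cnr").read())`.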
The .cnr file clearly shows the deletion, so the earlier processing steps look fine. Running CBS manually in R with DNAcopy detects the deletion easily using default parameters, with or without weights. The CNVkit commands I ran, followed by the manual R code:
cnvkit.py batch ../shortcuts/C1-25.bam -r /mnt/capture/cnvkit/Sept30_2015_reference.cnn --output-dir results/
cnvkit.py segment C1-25.cnr -m cbs
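# Manual CBS in R with DNAcopy on the same .cnr file
library("DNAcopy")
datatab <- read.table("C1-25.cnr", header=TRUE, comment.char="")
# Build the copy-number object from the per-bin log2 ratios
CNA.object <- CNA(cbind(datatab$log2), datatab$chromosome, datatab$start, data.type="logratio")
# Segment with default parameters, using the bin weights from the .cnr file
segment.CNA.object <- segment(CNA.object, verbose=1, weights=datatab$weight)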
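For anyone who wants to confirm from the bin-level data alone that a deletion is visible, here is a rough sketch that scans for runs of consecutive bins with very low log2 ratios. This is a plain sanity check, not CBS and not anything CNVkit does internally. The function name `low_log2_runs`, the `threshold` of -2.0, the `min_bins` of 3, and the toy bins are all assumptions chosen for illustration.

```python
def low_log2_runs(bins, threshold=-2.0, min_bins=3):
    """Find runs of consecutive bins whose log2 ratio is below `threshold`.

    bins: iterable of (chromosome, start, end, log2), sorted by position.
    Returns a list of (chromosome, run_start, run_end, n_bins).
    """
    runs = []
    current = None  # [chromosome, run_start, run_end, n_bins] or None
    for chrom, start, end, log2 in bins:
        if log2 < threshold and current is not None and current[0] == chrom:
            # Extend the open run on the same chromosome.
            current[2] = end
            current[3] += 1
        else:
            # Close the open run, keeping it only if it is long enough.
            if current is not None and current[3] >= min_bins:
                runs.append(tuple(current))
            # Start a new run if this bin is itself below the threshold.
            current = [chrom, start, end, 1] if log2 < threshold else None
    if current is not None and current[3] >= min_bins:
        runs.append(tuple(current))
    return runs


# Toy bins for illustration only; these are not values from my sample.
toy_bins = [
    ("1", 100, 200, 0.05),
    ("1", 300, 400, -4.1),
    ("1", 500, 600, -3.8),
    ("1", 700, 800, -5.0),
    ("1", 900, 1000, 0.01),
    ("2", 100, 200, -3.0),
]
print(low_log2_runs(toy_bins))
```

On the toy bins this reports one run on chromosome 1 spanning three bins; the single low bin on chromosome 2 is dropped because it is shorter than `min_bins`.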
Any help would be greatly appreciated.
Vince
Would you mind tagging this question with "cnvkit" so it's easier to find? I missed it earlier, sorry.