I've been using CNV-Seq to detect CNVs in a tumour/normal pair.
CNV-Seq produces 2 different files:
A .count file, e.g.:
chromosome start end test ref
X 1 1000000 46775 114751
X 500001 1500000 51545 130859
X 1000001 2000000 48616 126085
X 1500001 2500000 49244 126727
And a .cnv file:
"chromosome" "start" "end" "test" "ref" "position" "log2" "p.value" "cnv" "cnv.size" "cnv.log2" "cnv.p.value"
"X" 1 1000000 46775 114751 5e+05 -0.0481481369630764 8.39828906687997e-11 0 NA NA NA
"X" 500001 1500000 51545 130859 1e+06 -0.0975597262049925 4.48810759735315e-38 0 NA NA NA
"X" 1000001 2000000 48616 126085 1500000 -0.128344519593103 8.97317524652341e-64 0 NA NA NA
"X" 1500001 2500000 49244 126727 2e+06 -0.117155042936424 1.1243550712914e-53 0 NA NA NA
"X" 2000001 3000000 45486 130448 2500000 -0.273431318743759 5.73887669762662e-268 0 NA NA NA
My understanding was that the read counts in these files (columns test and ref in both files) were normalised counts, i.e. corrected for the difference in sequencing depth between the tumour and normal BAM files.
However, on plotting these read counts this doesn't seem to be the case: I consistently see a depth in the normal sample (ref) that is ~2x the depth in the tumour sample (test), which matches the depths we actually sequence to. This seems odd, as the .cnv file contains the "final" CNV calls (with associated p-values etc.).
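One way to check this from the rows shown above is to compare the reported log2 column with the log2 of the raw test/ref ratio. The sketch below is only an illustration using the five .cnv rows from this post; the helper name `log2_offsets` is made up for the example and is not part of CNV-Seq.

```python
import math

# Rows copied from the .cnv excerpt above: (test, ref, reported log2).
rows = [
    (46775, 114751, -0.0481481369630764),
    (51545, 130859, -0.0975597262049925),
    (48616, 126085, -0.128344519593103),
    (49244, 126727, -0.117155042936424),
    (45486, 130448, -0.273431318743759),
]


def log2_offsets(rows):
    """Return (reported log2) minus (log2 of the raw test/ref ratio) per window."""
    return [reported - math.log2(test / ref) for test, ref, reported in rows]


offsets = log2_offsets(rows)
for (test, ref, reported), offset in zip(rows, offsets):
    raw = math.log2(test / ref)
    print(f"raw log2={raw:.4f}  reported log2={reported:.4f}  offset={offset:.4f}")

# If the offset is the same in every window, the log2 column is the raw ratio
# multiplied by a single global factor of 2**offset.
implied_factor = 2 ** (sum(offsets) / len(offsets))
print(f"implied global scaling factor: {implied_factor:.3f}")
```

For these rows the offset comes out the same in every window (about 1.25, i.e. a factor of roughly 2.4). That would be consistent with test and ref being raw counts, with the depth correction applied as one global factor when the log2 column is computed. Whether that factor is exactly the ratio of total reads in the two samples is an assumption I can't confirm from these files alone.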
Does anyone have any insight into this? I've emailed the corresponding author of the original paper, but it seems that he has moved on.