Hey there, I can get results from SciClone when clustering a single set of variants, but errors are thrown and the analysis halts when I try 2D clustering of two samples. In other words, the analysis works fine when run on each sample separately but not when the two samples are run together. Why might this be happening? Here are snippets of the input data and the commands used:
Sample1 CN Input
1 809600 248685900 -0.0374072665893745
2 41700 242140700 0.00396731936579034
3 361600 195792272 -0.0277007984123769
Sample2 CN Input
1 809600 248685900 0.00186900988973912
2 41700 242794100 0.0235254793059249
3 361600 129152000 -0.085771971887389
Sample1 VAF Input used for 1D clustering
chr1 1179290 0 0 0
chr1 1179290 0 39 100
chr1 2433825 30 5 14.2857
Sample2 VAF Input used for 1D clustering
chr1 981870 28 6 10.7143
chr1 981870 28 22 39.2857
chr1 2428881 82 20 19.6078
Sample1 VAF Input for 2D clustering
chr1 10386355 16 4 20
chr1 109338933 0 0 0
chr1 109778576 0 0 0
Sample2 VAF Input for 2D clustering
chr1 10386355 0 0 0
chr1 109338933 37 6 13.9535
chr1 109778576 0 258 100
The commands for 1D and 2D clustering are:
sciClone(vafs=v1, copyNumberCalls=c1, minimumDepth=25, sampleNames=c('sampleName'))
sciClone(vafs=list(v1, v2), copyNumberCalls=list(c1, c2), minimumDepth=25, sampleNames=c('sampleName', 'otherName'))
As I mentioned, 1D clustering works fine; it is 2D clustering that fails, with an error like this:
[1] "checking input data..."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
4 sites (of 624 original sites) are copy number neutral and have adequate depth in all samples
618 sites (of 624 original sites) were removed because of copy-number alterations
620 sites (of 624 original sites) were removed because of inadequate depth
620 sites (of 624 original sites) were removed because of copy-number alterations or inadequate depth
[1] "clustering each dimension independently"
Disable overlapping std dev condition
[1] "ERROR: only 4 points - not enough points to cluster when using 10 intialClusters. Provide more data or reduce your maximumClusters option"
[1] "finished 1d clustering P7 Prim ..."
[1] "found -Inf clusters using bmm in dimension P7 Prim"
NULL
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 4, 0
In addition: Warning message:
In max(marginalClust[[i]]$cluster.assignments, na.rm = T) :
no non-missing arguments to max; returning -Inf
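To make the depth numbers in that log easier to check, here is a minimal sketch of how I would count sites with adequate depth in both samples, using only the three 2D input rows shown above. It assumes depth is ref reads plus var reads, that the cutoff is inclusive, and that the two tables list the same sites in the same order. The column names and the helper `depth_ok` are my own for illustration and are not part of SciClone.

```r
# The three 2D input rows from each sample (columns: chr, pos, ref reads, var reads, VAF).
# Column names are chosen here for readability; they are an assumption, not SciClone's.
v1 <- data.frame(chr = "chr1", pos = c(10386355, 109338933, 109778576),
                 ref = c(16, 0, 0), var = c(4, 0, 0), vaf = c(20, 0, 0))
v2 <- data.frame(chr = "chr1", pos = c(10386355, 109338933, 109778576),
                 ref = c(0, 37, 0), var = c(0, 6, 258), vaf = c(0, 13.9535, 100))

min_depth <- 25  # same value as minimumDepth in the commands above

# Hypothetical helper: TRUE where total read depth (ref + var) meets the cutoff.
# Assumes an inclusive cutoff; SciClone's exact comparison may differ.
depth_ok <- function(v, min_depth) (v$ref + v$var) >= min_depth

ok1 <- depth_ok(v1, min_depth)
ok2 <- depth_ok(v2, min_depth)
both <- ok1 & ok2  # a site is usable for 2D clustering only if it passes in every sample

print(data.frame(pos = v1$pos, sample1 = ok1, sample2 = ok2, both = both))
n_both <- sum(both)
```

In this snippet none of the three sites passes in both samples, since a site with 0 reads in either sample fails the cutoff regardless of its depth in the other.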
Something that is confusing me is that I do not need to use cnCallsAreLog2=TRUE to get results with 1D clustering, and the results are minimally different from when I repeat the analysis using cnCallsAreLog2=TRUE. I think there must be a few things I am not understanding about SciClone. Why can I use these CN values without cnCallsAreLog2=TRUE? Why does the analysis of these samples run independently but not jointly? Any insight is much appreciated.
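To make the cnCallsAreLog2 question concrete, here is a rough sketch of how I would expect the two interpretations of my Sample1 CN values to differ. It assumes the fourth column is a log2 ratio relative to diploid, so that absolute copy number is 2 * 2^x, and that a segment counts as copy-number neutral when it is within 0.25 of 2. Both the conversion and the 0.25 margin are my assumptions about what SciClone does, not something I have confirmed.

```r
# Fourth column of the Sample1 CN input shown above.
log2_ratio <- c(-0.0374072665893745, 0.00396731936579034, -0.0277007984123769)

# Assumed conversion from log2 ratio (relative to diploid) to absolute copy number.
abs_cn <- 2 * 2^log2_ratio

margin <- 0.25  # assumed neutrality margin around a copy number of 2

# Interpretation 1: values are log2 ratios and get converted first.
neutral_if_log2 <- abs(abs_cn - 2) <= margin

# Interpretation 2: values are taken as absolute copy numbers as-is.
neutral_if_raw <- abs(log2_ratio - 2) <= margin

print(data.frame(log2_ratio, abs_cn, neutral_if_log2, neutral_if_raw))
```

Under these assumptions the converted values sit close to 2 and count as neutral, while the unconverted values sit near 0 and do not, which is why I expected the flag to make a large difference.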
Edit: typo