Question

SciClone works with 1d clustering, but not 2d?

6.0 years ago

wisty • 0

Hey there, I can get results from SciClone when clustering a single set of variants, but errors are thrown and the analysis halts when I try 2D clustering of two samples. In other words, the analysis works fine when run on each sample separately but not when the two are run together. Why might this be happening? Here are snippets of the input data and the commands used:

Sample1 CN Input
1       809600  248685900       -0.0374072665893745
2       41700   242140700       0.00396731936579034
3       361600  195792272       -0.0277007984123769

Sample2 CN Input
1       809600  248685900       0.00186900988973912
2       41700   242794100       0.0235254793059249
3       361600  129152000       -0.085771971887389

Sample1 VAF Input used for 1D clustering
chr1 1179290 0 0 0
chr1 1179290 0 39 100
chr1 2433825 30 5 14.2857

Sample2 VAF Input used for 1D clustering
chr1 981870 28 6 10.7143
chr1 981870 28 22 39.2857
chr1 2428881 82 20 19.6078

Sample1 VAF Input for 2D clustering
chr1    10386355        16      4       20
chr1    109338933       0       0       0
chr1    109778576       0       0       0

Sample2 VAF Input for 2D clustering
chr1    10386355        0       0       0
chr1    109338933       37      6       13.9535
chr1    109778576       0       258     100

The commands for 1D and 2D clustering are:

sciClone(vafs=v1, copyNumberCalls=c1, minimumDepth=25, sampleNames=c('sampleName'))
sciClone(vafs=list(v1, v2), copyNumberCalls=list(c1, c2), minimumDepth=25, sampleNames=c('sampleName', 'otherName'))

As I mentioned, 1D clustering works fine; it is 2D clustering that fails, with an error like:

[1] "checking input data..."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
4 sites (of 624 original sites) are copy number neutral and have adequate depth in all samples
618 sites (of 624 original sites) were removed because of copy-number alterations
620 sites (of 624 original sites) were removed because of inadequate depth
620 sites (of 624 original sites) were removed because of copy-number alterations or inadequate depth
[1] "clustering each dimension independently"
Disable overlapping std dev condition
[1] "ERROR: only 4  points - not enough points to cluster when using 10 intialClusters. Provide more data or reduce your maximumClusters option"
[1] "finished 1d clustering P7 Prim ..."
[1] "found -Inf clusters using bmm in dimension P7 Prim"
NULL
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 4, 0
In addition: Warning message:
In max(marginalClust[[i]]$cluster.assignments, na.rm = T) :
  no non-missing arguments to max; returning -Inf

Something that confuses me is that I do not need cnCallsAreLog2=TRUE to get results with 1D clustering, and the results differ only minimally when I repeat the analysis with cnCallsAreLog2=TRUE. I think there must be a few things I am not understanding about SciClone. Why can I use these CN values without cnCallsAreLog2=TRUE? Why does the analysis of these samples run independently but not jointly? Any insight is much appreciated.

Edit:typo

sciclone • 1.4k views

updated 5.9 years ago by Chris Miller • written 6.0 years ago by wisty

score 0 · Answer 1 · 2019-01-15

Sorry that I missed this. You don't have your input VAF files formatted correctly.

You have to actually merge your variant lists and get read counts for each variant in all samples. (Not called is not the same as 0% VAF!)

See this post (and several others under the sciclone tag) for more info. A: Question regarding for sciClone when doing a 2d plot
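
The merge step can be sketched roughly as follows in base R. This is only an illustration, not code from SciClone itself: `s1`, `s2` and `fill_sites` are hypothetical names, and the rows are the 2D inputs from the question with the `0 0 0` rows dropped, on the assumption that those rows were placeholders for sites not called in that sample. The idea is to build the union of sites, keep uncalled sites as NA rather than 0, and list the positions whose read counts still have to be pulled from each sample's BAM (bam-readcount is one tool that can do this).

```r
# Hypothetical per-sample call sets: chr, pos, ref reads, var reads, VAF (%).
# Assumption: only sites actually called in each sample are listed here.
s1 <- data.frame(chr = "chr1", pos = 10386355, ref = 16, var = 4, vaf = 20,
                 stringsAsFactors = FALSE)
s2 <- data.frame(chr = c("chr1", "chr1"),
                 pos = c(109338933, 109778576),
                 ref = c(37, 0), var = c(6, 258), vaf = c(13.9535, 100),
                 stringsAsFactors = FALSE)

# Union of all sites called in any sample.
sites <- unique(rbind(s1[, c("chr", "pos")], s2[, c("chr", "pos")]))

# Left-join one sample onto the union. Sites not called in that sample
# come back as NA, not as 0 reads / 0% VAF.
fill_sites <- function(sites, calls) {
  out <- merge(sites, calls, by = c("chr", "pos"), all.x = TRUE)
  out[order(out$chr, out$pos), ]
}
v1 <- fill_sites(sites, s1)
v2 <- fill_sites(sites, s2)

# Positions that still need read counts from each sample's BAM.
need1 <- v1[is.na(v1$ref), c("chr", "pos")]
need2 <- v2[is.na(v2$ref), c("chr", "pos")]
```

Once the real counts for the `need1` and `need2` positions are filled in, the VAF column can be recomputed as `100 * var / (ref + var)`, which matches the values in the question's inputs (for example 6 / (37 + 6) gives 13.9535). Both data frames then cover the same sites in the same order and can be passed together as `vafs=list(v1, v2)`.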