Hello everyone!
I recently installed SciClone (SciClone installation in R version 3.3.3) and am now trying to apply it to WES data from two tumors.
First, I used VarScan2 to call somatic variants, then split the output file into germline, somatic, and LOH calls. I took the high-confidence somatic file (as VAF data) and the high-confidence LOH file (as regions to exclude). I also used VarScan2 to call somatic copy number alterations, then used DNAcopy to segment the CNA and LOH data. All the commands I used were taken from the manuals. As the next step, I manually removed the 'chr' prefix and all data for sites on the sex chromosomes.
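As an aside, the manual cleanup step could be scripted so it is applied identically to every input file. This is only a minimal base R sketch: `clean_chromosomes` is a hypothetical helper (not part of SciClone, VarScan2, or DNAcopy), and it assumes the chromosome name sits in the first column and that the sex chromosomes are named X and Y.

```r
# Hypothetical helper: strip a leading "chr" and drop sex chromosomes.
# Assumes the chromosome name is in column `chr_col` (first column by default).
clean_chromosomes <- function(df, chr_col = 1) {
  chr <- sub("^chr", "", as.character(df[[chr_col]]))
  df[[chr_col]] <- chr
  df[!(chr %in% c("X", "Y")), , drop = FALSE]
}

# Toy segment table, made up purely for illustration.
segments <- data.frame(
  chr      = c("chr1", "chrX", "chr2", "chrY"),
  start    = c(100, 200, 300, 400),
  stop     = c(150, 250, 350, 450),
  seg_mean = c(0.02, -0.9, 0.6, -1.1),
  stringsAsFactors = FALSE
)

cleaned <- clean_chromosomes(segments)
print(cleaned)
```

The same function can be applied to the VAF table, the copy number segments, and the regions to exclude, so that all three use matching chromosome names.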
Unfortunately, my attempts to run SciClone on these data have not been successful. When I feed the data preprocessed this way to SciClone (using the command from the GitHub page), I get:
a) With regions to exclude:
[1] "checking input data..."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
can't do clustering - no copy number 2 regions to operate on in sample 1
b) Without regions to exclude:
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
1 sites (of 8350 original sites) are copy number neutral and have adequate depth in all samples
8346 sites (of 8350 original sites) were removed because of copy-number alterations
8294 sites (of 8350 original sites) were removed because of inadequate depth
8349 sites (of 8350 original sites) were removed because of copy-number alterations or inadequate depth
[1] "clustering each dimension independently"
Disable overlapping std dev condition
[1] "ERROR: only 1 points - not enough points to cluster when using 10 intialClusters. Provide more data or reduce your maximumClusters option"
[1] "finished 1d clustering A ..."
[1] "found -Inf clusters using bmm in dimension A"
NULL
Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 1, 0
In addition: Warning message:
In max(marginalClust[[i]]$cluster.assignments, na.rm = T) : no non-missing arguments to max; returning -Inf
From what I read in another thread (sciClone error for two sample), my problem might be that the sample columns for the two tumors differ.
So, my questions are:
1) What is the recommended way to prepare data for SciClone?
2) Does anyone know how I can filter the files with VAFs of somatic mutations automatically? At which step of preprocessing should I do it?
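On the filtering question, here is a rough sketch of turning a VarScan2 somatic table into the 5-column VAF input that SciClone reads (chromosome, position, reference reads, variant reads, VAF as a percentage). It assumes the standard VarScan2 column names (`chrom`, `position`, `tumor_reads1`, `tumor_reads2`, `tumor_var_freq`, `somatic_status`); `varscan_to_sciclone` is a hypothetical helper and the toy rows are made up. The depth check uses 100 because that is, as far as I know, the default of SciClone's `minimumDepth` argument, which would explain why most WES sites in the log above are reported as having inadequate depth.

```r
# Hypothetical helper: keep somatic calls and build SciClone's 5-column VAF table.
# Column names are assumed to follow the VarScan2 somatic output header.
varscan_to_sciclone <- function(vs) {
  somatic <- vs[vs$somatic_status == "Somatic", , drop = FALSE]
  data.frame(
    chr       = sub("^chr", "", as.character(somatic$chrom)),
    pos       = somatic$position,
    ref_reads = somatic$tumor_reads1,
    var_reads = somatic$tumor_reads2,
    # VarScan2 reports the frequency as text such as "33.33%"
    vaf       = as.numeric(sub("%", "", somatic$tumor_var_freq, fixed = TRUE)),
    stringsAsFactors = FALSE
  )
}

# Toy input, for illustration only.
vs <- data.frame(
  chrom          = c("chr1", "chr1", "chr2"),
  position       = c(1000, 2000, 3000),
  tumor_reads1   = c(80, 20, 30),
  tumor_reads2   = c(40, 5, 30),
  tumor_var_freq = c("33.33%", "20%", "50%"),
  somatic_status = c("Somatic", "Germline", "Somatic"),
  stringsAsFactors = FALSE
)

vaf1 <- varscan_to_sciclone(vs)

# How many sites would survive a depth cutoff of 100?
depth  <- vaf1$ref_reads + vaf1$var_reads
n_deep <- sum(depth >= 100)
print(vaf1)
print(n_deep)
```

If only a handful of sites pass the cutoff, lowering `minimumDepth` in the SciClone call is worth considering, with the caveat that VAF estimates at low depth are noisier.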
You can also just add the
cnCallsAreLog2=TRUE
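parameter when running SciClone, since DNAcopy segment means derived from VarScan2 copy number calls are log2 ratios, while SciClone by default expects absolute copy number (2 = neutral).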
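To see why this matters: a copy-neutral segment has a log2 ratio near 0, which SciClone would read as an absolute copy number of 0 rather than 2, so nearly every site gets discarded as copy-number altered. Below is a small sketch of the conversion, assuming the usual diploid relation CN = 2 * 2^log2ratio and the standard DNAcopy output columns (`chrom`, `loc.start`, `loc.end`, `seg.mean`). `log2_to_absolute` is a hypothetical helper and the segments are made up; the SciClone call is left commented out because the input objects in it (`vaf1`, `vaf2`, `cn1`, `cn2`) are placeholders.

```r
# Hypothetical helper: convert a log2 ratio to absolute copy number,
# assuming a diploid baseline (log2 ratio 0 corresponds to 2 copies).
log2_to_absolute <- function(seg_mean) 2 * 2^seg_mean

# Toy DNAcopy-style segment table, for illustration only.
dnacopy_out <- data.frame(
  ID        = "tumor1",
  chrom     = c("1", "1", "2"),
  loc.start = c(1, 5001, 1),
  loc.end   = c(5000, 9000, 8000),
  num.mark  = c(50, 40, 80),
  seg.mean  = c(0, 1, -1),
  stringsAsFactors = FALSE
)

# SciClone's copy number input is 4 columns: chr, start, stop, segment mean.
cn1 <- data.frame(
  chr          = dnacopy_out$chrom,
  start        = dnacopy_out$loc.start,
  stop         = dnacopy_out$loc.end,
  segment_mean = log2_to_absolute(dnacopy_out$seg.mean),
  stringsAsFactors = FALSE
)
print(cn1)

# Either pass the converted table as-is, or keep the log2 values in column 4
# and let SciClone convert them (placeholder inputs, not run here):
# sc <- sciClone(vafs = list(vaf1, vaf2),
#                copyNumberCalls = list(cn1, cn2),
#                sampleNames = c("tumor1", "tumor2"),
#                cnCallsAreLog2 = TRUE)
```

Use one approach or the other, not both: converting the segment means yourself and also setting `cnCallsAreLog2=TRUE` would transform the values twice.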