Question

SciClone on WES data

7.5 years ago

mar.ark.parr ▴ 40

Hello everyone!

I have recently installed SciClone (see "SciClone installation in R version 3.3.3") and am now trying to apply it to WES data from two tumors.

First, I used VarScan2 to call somatic variants, then split the output file into germline, somatic, and LOH calls. I took the high-confidence somatic calls (as VAF data) and the high-confidence LOH calls (as regions to exclude). I also used VarScan2 to call somatic copy number alterations, then used DNAcopy to segment the CNA and LOH data. All the commands I used were taken from the manuals. As a final step, I manually removed the 'chr' prefix and all sites on the sex chromosomes.

Unfortunately, my attempts to run SciClone on these data haven't been successful. When I feed the data preprocessed this way to SciClone (using the command from the GitHub page), I get:

a) with regions to exclude

[1] "checking input data..."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
can't do clustering - no copy number 2 regions to operate on in sample 1

b) without regions to exclude

[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2."
1 sites (of 8350 original sites) are copy number neutral and have adequate depth in all samples
8346 sites (of 8350 original sites) were removed because of copy-number alterations
8294 sites (of 8350 original sites) were removed because of inadequate depth
8349 sites (of 8350 original sites) were removed because of copy-number alterations or inadequate depth
[1] "clustering each dimension independently"
Disable overlapping std dev condition
[1] "ERROR: only 1  points - not enough points to cluster when using 10 intialClusters. Provide more data or reduce your maximumClusters option"
[1] "finished 1d clustering A ..."
[1] "found -Inf clusters using bmm in dimension A"
NULL
Error in data.frame(..., check.names = FALSE) :
  arguments imply differing number of rows: 1, 0
In addition: Warning message:
In max(marginalClust[[i]]$cluster.assignments, na.rm = T) :
  no non-missing arguments to max; returning -Inf

From what I read in this topic (sciClone error for two sample), my problem might be that the sample columns for the two tumors are different.

So, my questions are: 1) What is the recommended way to prepare data for SciClone? 2) Does anyone know how I can filter the files with VAFs of somatic mutations automatically? At which preprocessing step should I do it?

sciclone wes clonality varscan dnacopy • 3.9k views


score 0 · Answer 1 · 2017-06-07

[1] "checking input data..." 
[1] "Not all variants fall within a provided copy number region. The copy number of these variants is assumed to be 2." 
can't do clustering - no copy number 2 regions to operate on in sample 1

This error means that every mutation in your sample falls either in a copy-number altered region (outside 1.75 to 2.25 by default) or in one of the regions that you've listed for exclusion. If there is no data to operate on, there's no way to do clustering.

1 sites (of 8350 original sites) are copy number neutral and have adequate depth in all samples 
8346 sites (of 8350 original sites) were removed because of copy-number alterations 
8294 sites (of 8350 original sites) were removed because of inadequate depth 
8349 sites (of 8350 original sites) were removed because of copy-number alterations or inadequate depth

Trying it the other way expands on that - now you have exactly one point that doesn't fall into a copy-number altered region, despite having lots of input mutations. I would check your data closely - are you sure that you've provided copy number data in the appropriate format? Have you visualized the CN data to make sure that it makes sense?
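One rough way to do that check is to look up the segment mean under each variant and count how many land in the copy-number-neutral window. The sketch below uses base R only and made-up toy data. The column names (chr, start, stop, segment_mean for segments; chr, pos for variants) and the helper cn_at_variant are assumptions for illustration, not SciClone's own code, and the inclusive 1.75 to 2.25 window is just the default quoted above.

```r
# Toy copy number segments (values are made up for illustration).
# Assumed layout: chr, start, stop, segment_mean (absolute copy number).
cn <- data.frame(
  chr          = c("1", "1", "2"),
  start        = c(1, 50001, 1),
  stop         = c(50000, 100000, 80000),
  segment_mean = c(2.05, 3.10, 0.02),
  stringsAsFactors = FALSE
)

# Toy variant positions. Assumed layout: chr, pos.
vafs <- data.frame(
  chr = c("1", "1", "2", "3"),
  pos = c(1200, 60000, 500, 42),
  stringsAsFactors = FALSE
)

# Hypothetical helper: segment mean covering one variant. Variants outside
# every segment get 2, mirroring the "assumed to be 2" message in the log.
cn_at_variant <- function(chr, pos, cn, default = 2) {
  hit <- which(cn$chr == chr & cn$start <= pos & cn$stop >= pos)
  if (length(hit) == 0) default else cn$segment_mean[hit[1]]
}

vafs$cn <- mapply(cn_at_variant, vafs$chr, vafs$pos,
                  MoreArgs = list(cn = cn), USE.NAMES = FALSE)

# Flag variants inside the neutral window quoted above (1.75 to 2.25).
vafs$neutral <- vafs$cn >= 1.75 & vafs$cn <= 2.25

print(summary(cn$segment_mean))  # values centred on 0 would suggest log2 ratios
print(table(vafs$neutral))       # how many variants would survive the CN filter
```

If almost nothing is flagged as neutral and the segment means cluster around 0 rather than 2, the copy number column is probably on a log2 scale rather than an absolute one.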

score 0 · Answer 2 · 2018-10-05

6.1 years ago

aliz0611 • 0

Seg files typically have seg.mean values close to zero because they store log2 ratios: a log ratio of 0 means tumor and normal have equal copy number. If your segment means are close to 0 (instead of close to 2, as SciClone expects), the seg.mean column needs to be transformed into absolute copy number. Assuming that the expected ploidy is 2, use:

cn1$segment_mean <- 2^(cn1$seg.mean+1)

This was derived from the following equation:

seg.mean = log2 ( CALLED_PLOIDY / EXPECTED_PLOIDY), where EXPECTED_PLOIDY is 2
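Solving that equation for the called ploidy gives CALLED_PLOIDY = 2 * 2^seg.mean = 2^(seg.mean + 1), which is the line of code above. A small worked example on made-up values follows; the data frame cn1 and its DNAcopy-style column names (chrom, loc.start, loc.end, seg.mean) are assumptions for illustration.

```r
# Hypothetical segment table with log2 ratios in seg.mean (toy values).
cn1 <- data.frame(
  chrom     = c("1", "1", "2"),
  loc.start = c(1, 50001, 1),
  loc.end   = c(50000, 100000, 80000),
  seg.mean  = c(0, -1, log2(3 / 2)),
  stringsAsFactors = FALSE
)

# Invert seg.mean = log2(CN / 2):  CN = 2 * 2^seg.mean = 2^(seg.mean + 1)
cn1$segment_mean <- 2^(cn1$seg.mean + 1)

# Keep the four columns in chr, start, stop, segment_mean order.
print(cn1[, c("chrom", "loc.start", "loc.end", "segment_mean")])
```

A log ratio of 0 maps to two copies, -1 to a single-copy loss, and log2(3/2) to a single-copy gain, so the converted column should now be centred near 2.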


You can also just add the cnCallsAreLog2=TRUE parameter when running sciClone, which tells it that the copy number calls are log2 ratios.

Reply • 6.1 years ago by Chris Miller 22k