I downloaded the ABSOLUTE example data from http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/ABSOLUTE/2. When I compared the input segment file with the result data, I found the column named "copy_ratio" very confusing.
> segtab2 = readr::read_tsv("~/Downloads/ABSOLUTE exampledata/1131213/solid_tumor.segtab.txt")
> head(segtab2)
# A tibble: 6 x 16
sample Chromosome Start.bp End.bp n_probes length seg_sigma W copy_ratio modal_cn expected_cn subclonal
<chr> <int> <int> <int> <int> <int> <dbl> <dbl> <dbl> <int> <dbl> <int>
1 solid… 1 3218610 1.44e7 5840 1.12e7 0.00164 0.00394 0.456 2 2.08 0
2 solid… 1 14450545 1.45e7 2 2.46e3 0.0884 0 0.131 0 0.583 0
3 solid… 1 14455748 3.17e7 9063 1.72e7 0.00131 0.00604 0.462 2 2.11 0
4 solid… 1 31682380 3.17e7 40 5.37e4 0.0198 0.00002 0.340 1 1.48 0
5 solid… 1 31740706 3.18e7 4 9.93e3 0.0625 0 0.568 3 2.69 0
6 solid… 1 31751191 3.18e7 35 7.87e4 0.0211 0.00003 0.692 3 3.52 0
# ... with 4 more variables: cancer_cell_frac <dbl>, ccf_ci95_low <dbl>, ccf_ci95_high <dbl>, hz <int>
> segfile = readr::read_tsv("~/Downloads/ABSOLUTE exampledata/SNP6_solid_tumor.seg.txt")
> head(segfile)
# A tibble: 6 x 6
Sample Chromosome Start End Num_Probes Segment_Mean
<chr> <int> <int> <int> <int> <dbl>
1 TCGA-DK-A1A6-01A-11D-A13V-01 1 3218610 14449771 5840 -0.133
2 TCGA-DK-A1A6-01A-11D-A13V-01 1 14450545 14453003 2 -1.94
3 TCGA-DK-A1A6-01A-11D-A13V-01 1 14455748 31677633 9063 -0.114
4 TCGA-DK-A1A6-01A-11D-A13V-01 1 31682380 31736123 40 -0.558
5 TCGA-DK-A1A6-01A-11D-A13V-01 1 31740706 31750640 4 0.184
6 TCGA-DK-A1A6-01A-11D-A13V-01 1 31751191 31829873 35 0.468
> head(segtab2$copy_ratio)
[1] 0.45600 0.13051 0.46211 0.33972 0.56813 0.69164
As far as I know, the copy ratio is calculated as 2^Segment_Mean, which for a Segment_Mean of -0.133 should be 0.9119964, but in every row ABSOLUTE appears to report 0.5 * 2^Segment_Mean instead.
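To make the pattern concrete, here is a minimal sketch that checks it on the six rows printed above. The values are typed in by hand from the console output, so Segment_Mean is rounded to three digits and the match is only approximate; the variable names are mine, not ABSOLUTE's.

```r
# Values copied by hand from the two head() outputs above (rounded as displayed).
segment_mean <- c(-0.133, -1.94, -0.114, -0.558, 0.184, 0.468)
absolute_copy_ratio <- c(0.45600, 0.13051, 0.46211, 0.33972, 0.56813, 0.69164)

# Copy ratio as I would normally derive it from a log2 ratio.
expected_ratio <- 2^segment_mean

# Row-by-row ratio between ABSOLUTE's copy_ratio and 2^Segment_Mean.
scale_factor <- absolute_copy_ratio / expected_ratio
print(round(scale_factor, 3))
```

For all six rows the factor comes out close to 0.5, so the difference looks like a constant scaling rather than a per-segment adjustment.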
How should I understand this?
Thanks, Shixiang