I am currently formatting my data for use with GISTIC 2.0. The documentation states that the Seg.CN column should be: (log2() -1 of copy number)
I want to verify with someone who has used the software before that I should take the log2 of my copy numbers and subtract one (1) to get this value.
Also, can anyone explain the significance of subtracting 1?
Yes, that's right. The values in this column are log2-scaled copy numbers relative to the normal copy number, which GISTIC assumes is 2 (diploid), so in this format neutral regions have a value of 0.0, losses are negative numbers, and gains are positive.
Subtracting 1 after taking the log is the same as dividing by 2 before taking the log -- you're dividing the observed absolute copy number (e.g. 3) by the expected copy number (2): log2(3/2) = log2(3) - log2(2) = log2(3) - 1 = +0.58
(Why do they do this transformation? This format, SEG, was originally designed for array CGH and SNP arrays, where the measured values are log-scaled ratios like this, not absolute copy numbers.)
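To make the arithmetic above concrete, here is a minimal Python sketch of the conversion. The function name seg_cn and the constant NORMAL_COPIES are just illustrative names, not anything from GISTIC itself, and the sketch assumes your input is an absolute copy number greater than zero.

```python
import math

NORMAL_COPIES = 2  # diploid baseline, as assumed in the explanation above


def seg_cn(copy_number, normal=NORMAL_COPIES):
    """Return the log2 ratio of observed to expected copy number.

    Equivalent to log2(copy_number) - 1 when normal is 2.
    Raises ValueError for copy_number <= 0, where log2 is undefined.
    """
    return math.log2(copy_number / normal)


# Print the Seg.CN value for a few absolute copy numbers.
for cn in (1, 2, 3, 4):
    print(cn, round(seg_cn(cn), 2))
# 1 -1.0
# 2 0.0
# 3 0.58
# 4 1.0
```

Dividing before taking the log and subtracting 1 afterwards give the same number, so use whichever is easier in your pipeline.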
Hi Etal,
What should I do when the absolute copy number is 0?
Thanks
Try a number fairly close to zero, appropriate for the accuracy/precision of the measurement -- e.g. 0.001 in absolute scale is roughly -10 in log2 scale. GISTIC should handle these values OK.
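As a rough sketch of that suggestion in Python: clamp the copy number to a small positive floor before taking the log. The name seg_cn_floored and the default floor of 0.001 are assumptions for illustration; pick a floor that matches the precision of your own data. Note that this sketch applies the floor to the absolute copy number and then divides by 2, so a copy number of 0 ends up at about -10.97 rather than -10.

```python
import math

FLOOR = 0.001  # illustrative floor; choose one suited to your measurement precision


def seg_cn_floored(copy_number, normal=2, floor=FLOOR):
    """Return log2(copy_number / normal), clamping copy_number to a small floor.

    The clamp avoids log2(0), which is undefined (negative infinity in the limit).
    """
    clamped = max(copy_number, floor)
    return math.log2(clamped / normal)


print(seg_cn_floored(0))  # large negative value instead of an error
print(seg_cn_floored(2))  # neutral region
```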
Thanks Etal. What about the number of markers?