I am working with TCGA data via cBioPortal and have a question about the field Fraction of Genome Altered (FGA). According to a post on the support mailing list, the definition is as follows:
The fraction of genome altered is the length of segments with log2 or linear CNA value larger than 0.2 divided by the length of all segments measured. It is based on the value from the segment files.
Since I am not very familiar with CNA analysis, I can’t make heads or tails of that. What exactly do they mean by segments: are they individual reads, or are they based on genes? Which parts of the genome are considered? For example, are introns included in this calculation?
log2(x) > 0.2 means the actual value x > 2^0.2 ≈ 1.149. Is FGA then simply the total length of the segments with log2(x) > 0.2 divided by the total length of all segments?
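To make my reading of the definition concrete, here is a minimal Python sketch of how I think the calculation would go. Everything in it is an assumption on my part: the toy segments are made up, the function name `fraction_genome_altered` is hypothetical, and I am guessing that each segment is a (chromosome, start, end, mean log2 value) record, as in a typical segment file. The quoted definition also does not say whether losses (values below -0.2) count, so the `use_abs` flag is only there to show both readings.

```python
# Toy segments: (chrom, start, end, seg_mean). All values are invented for illustration.
segments = [
    ("1", 0, 1_000_000, 0.05),
    ("1", 1_000_000, 1_500_000, 0.60),
    ("2", 0, 2_000_000, -0.40),
    ("2", 2_000_000, 2_500_000, 0.10),
]


def fraction_genome_altered(segments, threshold=0.2, use_abs=False):
    """Length of segments above the threshold divided by the length of all segments.

    use_abs=False follows the quoted wording literally (value > threshold).
    use_abs=True is an assumed alternative that also counts losses (|value| > threshold).
    """
    total = 0
    altered = 0
    for _chrom, start, end, seg_mean in segments:
        length = end - start
        total += length
        value = abs(seg_mean) if use_abs else seg_mean
        if value > threshold:
            altered += length
    return altered / total if total else float("nan")


# A log2 value of 0.2 corresponds to a linear ratio of 2 ** 0.2.
ratio_threshold = 2 ** 0.2

fga_gains_only = fraction_genome_altered(segments)
fga_abs = fraction_genome_altered(segments, use_abs=True)

print(f"linear threshold: {ratio_threshold:.3f}")
print(f"FGA (literal reading): {fga_gains_only}")
print(f"FGA (absolute value reading): {fga_abs}")
```

With these toy numbers the literal reading gives 0.125 and the absolute-value reading gives 0.625, so which reading applies makes a big difference to the result.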
Another interesting question, in my humble opinion, is whether we are talking about full-scale duplication or any expansion. To give some context: does a CNA value of 1 mean that a particular gene is duplicated entirely? If so, how does one interpret a value like 1.149, which corresponds to the threshold value mentioned above?