Question

CNV data from TCGA

2

9.5 years ago

Na Sed ▴ 310

I am analyzing CNV data downloaded from the TCGA database (level 3) and want to convert it to a gene-level matrix.

The files look like this:

Sample    Chromosome    Start    End    Num_Probes    Segment_Mean
BAIZE_p_TCGA_b138_SNP_N_GenomeWideSNP_6_A02_808774    1    3218610    16796721    7253    -0.0198
BAIZE_p_TCGA_b138_SNP_N_GenomeWideSNP_6_A02_808774    1    16796742    17763566    312    -0.3615
BAIZE_p_TCGA_b138_SNP_N_GenomeWideSNP_6_A02_808774    1    17764034    221905958    105172    -0.0073

To convert the CNV data to gene-level data, I map genomic regions to genes. In some cases, two different regions with different 'Segment_Mean' values map to the same gene. In that case, is it correct to use the average of the 'Segment_Mean' values for that gene?

Any thoughts?

I should mention that the data were obtained using the SNP Array 6.0.

Thanks

CNV TCGA • 4.2k views

updated 22 months ago by Ram 44k • written 9.5 years ago by Na Sed ▴ 310

Ram · Answer 1 · 2015-05-18

2

9.5 years ago

Chris Miller 22k

That may or may not be a reasonable assumption. It depends entirely on what you're trying to infer from the data. For example, if the breakpoint is truly in the middle of a gene, the amplified copy of the gene is incomplete and may well be non-functional. That could be equivalent to no amplification, or the truncated (or fused) protein could have unexpected effects. In such a case, interpreting it as an amplification would clearly be wrong.

So there's no easy answer. It's not wrong, per se, to do what you're suggesting, but be aware of the caveats, and be explicit about what you did when you write it up.
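A minimal sketch of one way to make the choice explicit, assuming an overlap-weighted average rather than a plain mean, and assuming gene and segment coordinates are 1-based, inclusive, and on the same genome build. The segments are the three rows from the question; the gene `GENE_A` and its coordinates are hypothetical and were picked only because they span the first breakpoint. The names `Segment`, `Gene`, and `gene_level_value` are illustrative, not part of any TCGA tool.

```python
from collections import namedtuple

# Hypothetical record types for illustration only.
Segment = namedtuple("Segment", "chrom start end seg_mean")
Gene = namedtuple("Gene", "name chrom start end")


def gene_level_value(gene, segments):
    """Return (overlap-weighted mean of Segment_Mean, number of overlapping segments).

    Coordinates are treated as 1-based and inclusive (an assumption).
    A count above 1 means at least one breakpoint falls inside the gene,
    so the averaged value should be interpreted with care.
    """
    covered_bp = 0
    weighted_sum = 0.0
    n_segments = 0
    for seg in segments:
        if seg.chrom != gene.chrom:
            continue
        # Number of bases shared by the gene and the segment.
        overlap = min(gene.end, seg.end) - max(gene.start, seg.start) + 1
        if overlap > 0:
            covered_bp += overlap
            weighted_sum += overlap * seg.seg_mean
            n_segments += 1
    if covered_bp == 0:
        return None, 0  # gene not covered by any segment
    return weighted_sum / covered_bp, n_segments


# The three segments shown in the question (one sample, chromosome 1).
segments = [
    Segment("1", 3218610, 16796721, -0.0198),
    Segment("1", 16796742, 17763566, -0.3615),
    Segment("1", 17764034, 221905958, -0.0073),
]

# Hypothetical gene spanning the breakpoint between the first two segments.
gene = Gene("GENE_A", "1", 16796000, 16798000)

value, n_segments = gene_level_value(gene, segments)
spans_breakpoint = n_segments > 1
print(gene.name, value, n_segments, spans_breakpoint)
```

Keeping the segment count (or the breakpoint flag) alongside the averaged value lets you report, filter, or separately inspect the genes where the average hides a breakpoint.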

updated 22 months ago by Ram 44k • written 9.5 years ago by Chris Miller 22k

0

@Chris Miller, can you recommend some references on interpreting CNV data? I am new to this field.

updated 22 months ago by Ram 44k • written 9.5 years ago by Na Sed ▴ 310