How To Process TCGA Copy Number Variation Level 3 Data


9.7 years ago

ankita.mandal28529 ▴ 10

I have downloaded TCGA copy number variation data sets for my experiment. In CNV SNP_Array level 3 data, there are different file formats for different samples, like:

"BONZE_p_TCGAb56_SNP_1N_GenomeWideSNP_6_E07_666882.nocnv_hg19.seg",
"HAULS_p_TCGAb47_SNP_2N_GenomeWideSNP_6_F03_628484.nocnv_hg19.seg",
"CUSKS_p_TCGAb47_SNP_1N_GenomeWideSNP_6_D05_628212.nocnv_hg19.seg",
"URAEI_p_TCGASNP_b85_N_GenomeWideSNP_6_H09_735102.nocnv_hg19.seg", etc.

I do not understand the difference between these formats. Moreover, each
sample has multiple rows with several columns: Sample, Chromosome, Start,
End, Num_Probes, and Segment_Mean. The Start and End values are different
for every sample. I do not understand how to prepare a data set in which
each sample has a segment mean for each of multiple genes.

TCGA CNV • 3.5k views



What exactly do you mean by processing? Each row in a seg file corresponds to a region of the genome with equal copy number (a segment). You can open the seg files in IGV.
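If "processing" means getting one value per gene per sample, one possible approach (a rough sketch, not an established TCGA pipeline) is to intersect each segment with gene coordinates and take the overlap-weighted average of Segment_Mean. The column names below are the ones listed in the question. The gene table, the function names `read_seg` and `gene_level_means`, and all the numbers in the example are made up for illustration. Real gene coordinates would have to come from an hg19 annotation, since the files are `hg19` segments.

```python
import csv
import io
from collections import defaultdict


def read_seg(handle):
    """Parse a tab-delimited seg file into a list of segment dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    segments = []
    for row in reader:
        segments.append({
            "sample": row["Sample"],
            "chrom": row["Chromosome"],
            # int(float(...)) also accepts coordinates written like 1.5e+07
            "start": int(float(row["Start"])),
            "end": int(float(row["End"])),
            "mean": float(row["Segment_Mean"]),
        })
    return segments


def gene_level_means(segments, genes):
    """Return {(sample, gene): overlap-weighted Segment_Mean}.

    genes is a hypothetical dict: name -> (chrom, start, end), using the
    same genome build and chromosome naming as the seg file.
    Genes not covered by any segment are simply absent from the result.
    """
    weighted = defaultdict(float)
    covered = defaultdict(int)
    for seg in segments:
        for name, (chrom, g_start, g_end) in genes.items():
            if chrom != seg["chrom"]:
                continue
            # Number of bases shared by the segment and the gene (inclusive)
            overlap = min(seg["end"], g_end) - max(seg["start"], g_start) + 1
            if overlap > 0:
                key = (seg["sample"], name)
                weighted[key] += seg["mean"] * overlap
                covered[key] += overlap
    return {key: weighted[key] / covered[key] for key in weighted}


# Made-up example data, only to show the shape of the input and output.
example_seg = (
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "S1\t1\t1\t1000\t50\t0.0\n"
    "S1\t1\t1001\t2000\t40\t1.0\n"
    "S1\t2\t1\t5000\t200\t-0.5\n"
)
example_genes = {
    "GENE_A": ("1", 501, 1500),  # spans two segments
    "GENE_B": ("2", 100, 200),   # inside one segment
}

segments = read_seg(io.StringIO(example_seg))
result = gene_level_means(segments, example_genes)
for (sample, gene), value in sorted(result.items()):
    print(sample, gene, value)
```

The nested loop is fine for a handful of genes. For a whole genome annotation an interval-based tool would be a better fit. Running the same function over every sample's file and collecting the results gives a sample-by-gene table.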

9.7 years ago by Irsan ★ 7.8k
