9.0 years ago
ankita.mandal28529
▴
10
I have downloaded TCGA copy number variation data sets for my experiment. In the CNV SNP_Array level 3 data, the files for different samples have different file name formats, like: "BONZE_p_TCGAb56_SNP_1N_GenomeWideSNP_6_E07_666882.nocnv_hg19.seg", "HAULS_p_TCGAb47_SNP_2N_GenomeWideSNP_6_F03_628484.nocnv_hg19.seg", "CUSKS_p_TCGAb47_SNP_1N_GenomeWideSNP_6_D05_628212.nocnv_hg19.seg", "URAEI_p_TCGASNP_b85_N_GenomeWideSNP_6_H09_735102.nocnv_hg19.seg", etc. I do not understand the difference between these formats. Moreover, each sample has multiple rows with several columns: Sample, Chromosome, Start, End, Num_Probes, Segment_Mean. The Start and End values are different for every sample. I do not understand how to prepare a data set in which each sample has multiple genes, each with a segment mean.
What exactly do you mean by processing? Each row in a seg file corresponds to a region of the genome with a constant copy number (a segment). You can open the seg files in IGV.
9.0 years ago by
Irsan
★
7.8k
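Since each row of a seg file is a segment rather than a gene, one possible way to get a per-gene value is to look up which segments overlap each gene and average their Segment_Mean, weighted by the length of the overlap. The following is only a minimal sketch of that idea, not something proposed in this thread. The gene names and coordinates (GENE_A, GENE_B, GENE_C), the toy seg content, and the helper names read_segments and gene_segment_mean are all hypothetical; real gene coordinates would have to come from an hg19 annotation, and the overlap-weighted average is an assumption about how to summarise a gene that spans more than one segment.

```python
import csv
import io

# Toy seg content with the columns named in the question (values are made up).
TOY_SEG = (
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "S1\t1\t1000\t2000\t10\t0.5\n"
    "S1\t1\t2001\t5000\t30\t-0.5\n"
    "S1\t2\t1\t9000\t50\t0.0\n"
)

# Hypothetical gene coordinates: name -> (chromosome, start, end).
GENES = {
    "GENE_A": ("1", 1500, 2500),
    "GENE_B": ("2", 100, 200),
    "GENE_C": ("3", 100, 200),
}


def read_segments(handle):
    """Parse a tab-delimited seg file into a list of dicts with typed fields."""
    segments = []
    for row in csv.DictReader(handle, delimiter="\t"):
        segments.append({
            "sample": row["Sample"],
            "chrom": row["Chromosome"],
            # int(float(...)) also accepts positions written like 1.5e+07
            "start": int(float(row["Start"])),
            "end": int(float(row["End"])),
            "mean": float(row["Segment_Mean"]),
        })
    return segments


def gene_segment_mean(segments, chrom, gene_start, gene_end):
    """Overlap-weighted average of Segment_Mean over the segments covering a gene.

    Returns None when no segment overlaps the gene.
    """
    weighted_sum = 0.0
    covered = 0
    for seg in segments:
        if seg["chrom"] != chrom:
            continue
        # Number of bases shared by the segment and the gene (inclusive ends).
        overlap = min(seg["end"], gene_end) - max(seg["start"], gene_start) + 1
        if overlap > 0:
            weighted_sum += seg["mean"] * overlap
            covered += overlap
    return weighted_sum / covered if covered else None


segments = read_segments(io.StringIO(TOY_SEG))

# One row of a sample-by-gene table: gene name -> value for sample S1.
row_s1 = {
    gene: gene_segment_mean(segments, chrom, start, end)
    for gene, (chrom, start, end) in GENES.items()
}
print(row_s1)
```

Running the same lookup for every sample's seg file, with the same gene list, would give one row per sample and one column per gene, which is the kind of table the question asks about. Genes with no overlapping segment come back as None and would need to be handled as missing values.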