Hello all!
I have a question about how to use the level 3 CNV data measured with the Affymetrix Genome-Wide SNP 6.0 array.
After mapping all genes (annotated with UCSC RefSeq refFlat) to the cnv.seg and nocnv.seg files, I found that some genes fall within segments in both files. For example:
One isoform of the MSRB1 gene is on chromosome 6, and the copy number of the region containing this gene is recorded in both the cnv and nocnv files.
The column header line of both files is as follows:
Sample Chromosome Start End Num_Probes Segment_Mean
In the cnv.seg file:
FRUIT_p_TCGAb_327_328_329_NSP_GenomeWideSNP_6_A06_1367948 6 149661 18225301 14060 1.045
In the nocnv.seg file:
FRUIT_p_TCGAb_327_328_329_NSP_GenomeWideSNP_6_A06_1367948 6 1014281 18225301 11991 1.044
We can see that the segment mean scores are almost the same, and the two regions overlap!
The CNVs in the nocnv file are supposed to be de-noised, since they frequently appear in the normal samples kept by TCGA and in the resources of the Broad Institute batch, as described in the tangent normalization part of this pipeline.
So how should we determine the CNV segment mean of the MSRB1 gene: 1.045 or 1.044? Or should it be "NA", since the gene lies in a segment that seems to have been de-noised?
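To make the overlap concrete, here is a minimal Python sketch that parses the two records quoted above and measures how much of the region they share. The names Segment, parse_seg_line and overlap_length are made up for illustration, and the code assumes that Start and End are 1-based inclusive coordinates, which this thread does not confirm.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One row of a .seg file, using the column names from the header above."""
    sample: str
    chrom: str
    start: int
    end: int
    num_probes: int
    segment_mean: float


def parse_seg_line(line: str) -> Segment:
    """Parse one whitespace- or tab-separated .seg data row."""
    sample, chrom, start, end, num_probes, mean = line.split()
    return Segment(sample, chrom, int(start), int(end), int(num_probes), float(mean))


def overlap_length(a: Segment, b: Segment) -> int:
    """Number of shared bases, assuming 1-based inclusive coordinates (an assumption)."""
    if a.sample != b.sample or a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


sample_id = "FRUIT_p_TCGAb_327_328_329_NSP_GenomeWideSNP_6_A06_1367948"
cnv = parse_seg_line(f"{sample_id} 6 149661 18225301 14060 1.045")
nocnv = parse_seg_line(f"{sample_id} 6 1014281 18225301 11991 1.044")

shared = overlap_length(cnv, nocnv)
extra_probes = cnv.num_probes - nocnv.num_probes
print(f"shared bases: {shared}, probes only in the cnv segment: {extra_probes}")
```

Under these assumptions the two segments end at the same position and differ only in where they start and in how many probes they contain.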
I really need everybody's help! Thank you very much.
Hello bounlu, I was wondering: if, instead of somatic CNVs, I am interested in germline CNVs, then in the example above, would I take 1.045 as the segment mean for the germline CNV from 149661 to 1014281? Is this, in general, how I would get germline CNV info from CNV level 3 data: by removing somatic CNV sequences (sequences found only in nocnv.seg files) from somatic+germline sequences (sequences found in cnv.seg files)?
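For reference, the interval arithmetic behind this question can be sketched as below. This only computes which part of a cnv.seg segment is not covered by any nocnv.seg segment; whether that leftover region can actually be read as a germline CNV is exactly the open question and is not settled by the code. The function name subtract_intervals is hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
def subtract_intervals(outer, inners):
    """Return the parts of `outer` (start, end) not covered by any interval in `inners`.

    All intervals are (start, end) tuples on the same chromosome and sample,
    with inclusive coordinates (an assumption, not confirmed by the thread).
    """
    start, end = outer
    leftover = []
    cursor = start  # first position not yet known to be covered
    for s, e in sorted(inners):
        if e < cursor or s > end:
            continue  # interval lies entirely outside the remaining region
        if s > cursor:
            leftover.append((cursor, s - 1))  # gap before this interval
        cursor = max(cursor, e + 1)
    if cursor <= end:
        leftover.append((cursor, end))  # tail after the last interval
    return leftover


# The chromosome 6 segments from the example above.
cnv_segment = (149661, 18225301)
nocnv_segments = [(1014281, 18225301)]

leftover = subtract_intervals(cnv_segment, nocnv_segments)
print(leftover)
```

With inclusive coordinates the leftover region ends one base before the nocnv segment starts, so it stops at 1014280 rather than 1014281.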