I have PennCNV output containing CNV calls from a SNP array run on cancer samples. Each line looks like this:
chr2:76943205-76949315 numsnp=5 length=6,111 state1,cn=0 TCGAW.BAIZE_A03_808754 startsnp=SNP_A-1784113 endsnp=SNP_A-2059601 conf=16.886
I am interested in separating the segments by STATE, then doing some kind of summary and visualization.
For example, I'd like to be able to say that a certain region of the genome (delimited by particular SNPs) is in STATE 1 in 10% of individuals. The region could be focal or a whole chromosome arm; the state is given per segment, but I want it summarized across the population. My goal is to compare how often these states occur at different locations between populations.
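To make the kind of summary I mean concrete, here is a minimal Python sketch under a few assumptions: every line follows the layout of the example above, and the function names (`parse_penncnv_line`, `state_frequency`) are hypothetical helpers, not part of PennCNV. The total number of samples is passed in explicitly, on the assumption that individuals with no calls do not appear in the file at all.

```python
import re

# Pattern for one PennCNV call line, based on the example line in this post.
LINE_RE = re.compile(
    r"^chr(?P<chrom>\w+):(?P<start>\d+)-(?P<end>\d+)\s+"
    r"numsnp=(?P<numsnp>\d+)\s+"
    r"length=(?P<length>[\d,]+)\s+"
    r"state(?P<state>\d+),cn=(?P<cn>\d+)\s+"
    r"(?P<sample>\S+)\s+"
    r"startsnp=(?P<startsnp>\S+)\s+"
    r"endsnp=(?P<endsnp>\S+)"
)


def parse_penncnv_line(line):
    """Turn one call line into a dict with typed fields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized line: " + line)
    fields = match.groupdict()
    return {
        "chrom": fields["chrom"],
        "start": int(fields["start"]),
        "end": int(fields["end"]),
        "state": int(fields["state"]),
        "cn": int(fields["cn"]),
        "sample": fields["sample"],
    }


def state_frequency(segments, chrom, start, end, state, n_samples):
    """Fraction of samples with a segment in `state` overlapping the region.

    n_samples is the total number of individuals in the population
    (an assumption: it must be supplied by the caller).
    """
    carriers = {
        seg["sample"]
        for seg in segments
        if seg["chrom"] == chrom
        and seg["state"] == state
        and seg["start"] <= end   # closed-interval overlap test
        and seg["end"] >= start
    }
    return len(carriers) / n_samples


example = (
    "chr2:76943205-76949315 numsnp=5 length=6,111 state1,cn=0 "
    "TCGAW.BAIZE_A03_808754 startsnp=SNP_A-1784113 "
    "endsnp=SNP_A-2059601 conf=16.886"
)
segments = [parse_penncnv_line(example)]
print(state_frequency(segments, "2", 76900000, 77000000, state=1, n_samples=10))
```

Running the same function once per population, with that population's segments and sample count, would give the per-region frequencies I want to compare.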
Does anyone know of a good way to do this? In addition, can anyone suggest downstream analyses of the segments that I should consider for CNV data?