Hi all,
I am using GISTIC2.0 on a segmented file (segmented using CBS). I can't understand what the markers file is. What does a marker represent here? How do we choose markers?
Thanks
The markers file contains the positions of the probes before segmentation was performed. You do not have to supply a markers file with GISTIC versions > 2.0.23.
To understand the purpose of this file, see the original published work:
Some genes are affected by non-overlapping deletions, either on different alleles in one sample or across multiple samples. For such genes, a marker-based score does not weight the presence of all deletions affecting that gene, despite the fact that these events are likely to have similarly deleterious effects on gene function. We have developed a modified scoring and permutation procedure, termed GeneGISTIC, that scores genes rather than markers (Supplementary Methods in Additional file 1). In each sample, we assign each gene the minimal copy number of any marker contained within that gene, and then sum across all samples to compute the gene score. Because genes covering more markers are more likely to achieve a more extreme value by chance, the permutation procedure is adjusted to account for gene size; the score for a gene covering n markers is compared against a size-specific null distribution generated by computing minima over all running windows of size n in each sample and then randomly permuting these minimal values across the genome.
[source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3218867/]
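To make the quoted passage concrete, here is a minimal sketch of the idea in Python with NumPy. It is a simplified illustration and not GISTIC's actual code: the function names, the toy matrix, and the way the null is sampled (one random window minimum per sample, then summed) are my own assumptions. It only shows the gene score (per-sample minimum over the gene's markers, summed across samples) and a size-specific null built from running-window minima.

```python
import numpy as np

def gene_score(cn, first_marker, last_marker):
    """Sum over samples of the minimum copy number among the gene's markers.

    cn is a (samples x markers) array; the gene covers marker columns
    first_marker .. last_marker - 1 (half-open, hypothetical convention).
    """
    per_sample_min = cn[:, first_marker:last_marker].min(axis=1)
    return per_sample_min.sum()

def running_window_minima(cn, n):
    """Minimum over every running window of n consecutive markers, per sample."""
    # Result of sliding_window_view has shape (samples, markers - n + 1, n).
    windows = np.lib.stride_tricks.sliding_window_view(cn, n, axis=1)
    return windows.min(axis=-1)

def size_specific_null(cn, n, n_perm=1000, seed=0):
    """Null scores for a gene covering n markers (simplified assumption).

    For each permutation, pick one window minimum at a random genomic
    position in every sample and sum across samples.
    """
    rng = np.random.default_rng(seed)
    minima = running_window_minima(cn, n)
    n_samples, n_windows = minima.shape
    idx = rng.integers(0, n_windows, size=(n_perm, n_samples))
    return minima[np.arange(n_samples), idx].sum(axis=1)

# Toy data: 3 samples x 4 markers. Samples 0 and 1 carry non-overlapping
# deletions at markers 1 and 2, which both fall inside the same gene.
cn = np.array([
    [0.0, -1.0,  0.0, 0.0],
    [0.0,  0.0, -0.5, 0.0],
    [0.0,  0.0,  0.0, 0.0],
])

marker_scores = cn.sum(axis=0)   # per-marker sums across samples
score = gene_score(cn, 1, 3)     # gene covering markers 1 and 2
null = size_specific_null(cn, n=2)
p_value = (null <= score).mean() # fraction of null scores at least as extreme
```

In the toy data, neither marker alone captures both deletions, while the gene-level score combines them. That is why the number and position of markers per gene matters to the method.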
Kevin
Hi Kevin,
Just revisiting this because I am facing a similar problem. It seems GISTIC 2.0 still requires a segmentation file, which includes a "Num Markers" field. Do you know how to fill this in when using sequencing data?
Hi Kevin, do you use bins for your data? If I am using bins (for example, dividing the genome into 10k or 20k bins), I pass the number of bins each segment contains, because that is how many measurements we have for that segment. If not using bins (as with a targeted panel or WXS sample), I would pass the number of targeted regions in each segment.
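A rough sketch of how that count could be computed, in case it helps. The helper names, the 1-based inclusive coordinates, and the choice to count a target by its midpoint are my assumptions, not anything GISTIC prescribes. The six-column row layout (sample, chromosome, start, end, num markers, segment value) is how I understand the segmentation file format from the GISTIC documentation, so please check it against your version.

```python
from bisect import bisect_left, bisect_right

def count_bins(start, end, bin_size):
    """Number of fixed-size bins overlapped by a segment.

    Coordinates are assumed 1-based and inclusive; a bin that is only
    partly covered by the segment is still counted.
    """
    first_bin = (start - 1) // bin_size
    last_bin = (end - 1) // bin_size
    return last_bin - first_bin + 1

def count_targets(start, end, sorted_target_midpoints):
    """Number of targeted regions whose midpoint falls inside the segment.

    sorted_target_midpoints must be sorted ascending and refer to the
    same chromosome as the segment.
    """
    lo = bisect_left(sorted_target_midpoints, start)
    hi = bisect_right(sorted_target_midpoints, end)
    return hi - lo

def seg_row(sample, chrom, start, end, num_markers, seg_mean):
    """Format one tab-separated segmentation row (assumed six-column layout)."""
    return "\t".join(str(v) for v in (sample, chrom, start, end, num_markers, seg_mean))

# Binned data (hypothetical 10 kb bins)
n_binned = count_bins(10001, 35000, 10000)

# Targeted panel / WXS: hypothetical target midpoints on one chromosome
midpoints = [150, 1200, 5000, 9000]
n_targeted = count_targets(1000, 6000, midpoints)

row = seg_row("sample1", 1, 10001, 35000, n_binned, -0.42)
```

Either way, the value is just the number of measurements supporting the segment, so it should match however your upstream copy-number caller defined its bins or targets.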
I am no expert on GISTIC, unfortunately. I prefer to use another similar program, called GAIA. You may reach out to the GISTIC developers.
Do you know of any other programs that would work with CNV profiles derived from sequencing data, as opposed to array data?