Hi everybody,
I am working on array CGH data from a set of tumor samples with potentially high normal contamination. I have performed the necessary pre-processing (background subtraction, within-array normalization) on the tumor data, and run circular binary segmentation (CBS) on the resulting log ratio values.
I have seen that people often threshold the mean segment values following this step to call gains and losses (a typical value I have seen in the literature is a minimum absolute value of 0.3 for the log2 ratio). However, a fixed threshold like this implicitly assumes a minimum tumor purity, which may not hold for a subset of my samples.
I was wondering if anyone can suggest a more systematic way of approaching this problem in the presence of high levels (50-80%) of normal tissue contamination in the tumor sample.
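To make the purity dependence concrete, here is a minimal sketch of the log2 ratio one would expect for a segment in a tumor/normal mixture. It assumes a simple linear mixture model, a diploid normal baseline, log ratios centered on two copies, and no probe-level signal compression; the function name `expected_log2_ratio` is hypothetical and not taken from any package.

```python
import math


def expected_log2_ratio(tumor_copies, purity, normal_copies=2):
    """Expected log2 ratio of a segment under a linear tumor/normal mixture.

    Assumption: the measured signal is proportional to the average copy
    number of the mixture, relative to a diploid reference.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be in [0, 1]")
    # Average copy number across tumor and contaminating normal cells.
    mixed = purity * tumor_copies + (1.0 - purity) * normal_copies
    if mixed <= 0.0:
        # Homozygous deletion in a pure tumor: no signal at all.
        return float("-inf")
    return math.log2(mixed / normal_copies)


# Single-copy gain (3 copies) and single-copy loss (1 copy) at several purities.
for purity in (1.0, 0.5, 0.2):
    gain = expected_log2_ratio(3, purity)
    loss = expected_log2_ratio(1, purity)
    print(f"purity={purity:.1f}  gain={gain:+.3f}  loss={loss:+.3f}")
```

Under these assumptions, a single-copy gain at 20% purity gives a log2 ratio of about 0.14 and a single-copy loss about -0.15, so both would fall below a fixed 0.3 cutoff.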
Thank you!
Have you estimated the tumor purity of each of your samples already, or is that the next step you're looking to do?
I do have some (possibly not highly accurate) estimates from somatic mutation allele frequency data, and also rougher ones from the pathology reports.
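With per-sample purity estimates in hand, one option is to rescale each segment mean to an approximate tumor copy number before calling, rather than applying one cutoff to all samples. The sketch below is only an illustration under the same assumptions as a linear tumor/normal mixture with a diploid baseline; the function name and the sample values are hypothetical, not real data.

```python
def purity_corrected_copy_number(segment_log2_ratio, purity, normal_copies=2):
    """Approximate tumor copy number implied by a segment mean.

    Assumption: observed ratio = (purity * CN_tumor
    + (1 - purity) * normal_copies) / normal_copies, solved for CN_tumor.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    # Average copy number of the mixture implied by the log2 ratio.
    mixed = normal_copies * 2.0 ** segment_log2_ratio
    # Remove the contribution of normal cells and rescale by purity.
    return (mixed - (1.0 - purity) * normal_copies) / purity


# Hypothetical (segment mean, estimated purity) pairs for illustration only.
examples = {"sample_A": (0.14, 0.2), "sample_B": (0.14, 0.8)}
for name, (log_ratio, purity) in examples.items():
    cn = purity_corrected_copy_number(log_ratio, purity)
    print(f"{name}: log2 ratio={log_ratio:+.2f}, purity={purity:.1f}, CN ~ {cn:.2f}")
```

Because the correction divides by purity, any error in a rough purity estimate is amplified at low purity, so the resulting copy numbers are best treated as approximate.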