How many samples are good to find significant targets using GISTIC?
4.6 years ago
joyk2a ▴ 30

Hi, I am analyzing around thirty tumor samples. Do you think this sample size is sufficient to identify targets with GISTIC? Most of the papers I have read report more than 100 samples. I am wondering whether I can find valuable targets in a small cohort. It would be great if those of you who have used GISTIC or a similar analysis program could share your experience. Thanks a lot!

4.6 years ago

Thirty samples is a little low, which may mean that few regions reach statistical significance. However, I would encourage you to try it. You could justify relaxing the p-value threshold (i.e., making it less stringent) for the identified recurrent regions, keeping in mind that the relatively low sample number is always a limitation.

Kevin
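
To make the sample-size point concrete, here is a minimal sketch of how often a recurrent event would even be observed in a small cohort. It is plain binomial arithmetic, not anything GISTIC computes itself; the 10% event frequency and the "seen in at least 3 samples" cutoff are illustrative assumptions, and `prob_at_least` is a hypothetical helper name.

```python
from math import comb

def prob_at_least(k: int, n: int, f: float) -> float:
    """P(X >= k) for X ~ Binomial(n, f): the chance that an event with
    true frequency f is observed in at least k of n samples."""
    return sum(comb(n, i) * f**i * (1 - f) ** (n - i) for i in range(k, n + 1))

# Assumed values for illustration: an event present in 10% of tumors,
# and a requirement that it be seen in at least 3 samples.
for n in (15, 30, 100):
    p = prob_at_least(3, n, 0.10)
    print(f"n={n:>3}: P(event seen in >=3 samples) = {p:.2f}")
```

The probability rises with cohort size, which is one way to see why small cohorts tend to yield few significant regions.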


Thanks, Kevin! I will try to do it.


Kevin,

How does GISTIC depend on the number of samples being analyzed? I have a lot of samples, but I had to split them into batches to run CopywriteR, which is otherwise too computationally expensive. I ran GISTIC2.0 on the output from each of these batches.

It is not possible to combine CopywriteR output across batches, so I cannot run GISTIC2.0 on some sort of combined input. Will this be a problem?


It matters because GISTIC is a bit different from a standard copy number analysis tool. GISTIC takes, as input, the already-derived per-sample CN segments, and then processes all samples combined in order to 'score' copy number events across the entire cohort. In a way, it's doing the same as GAIA, i.e., looking for recurrently aberrant regions. So, with a low number of samples, it would be difficult for any region to obtain a reliable score. I have used GAIA more than GISTIC, though.

I'm not too familiar with CopywriteR, to be honest.
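
The idea of scoring events across the whole cohort can be sketched with a toy example. This is not GISTIC's implementation: it only mimics the outline of an amplification score (frequency times mean amplitude, which equals the summed amplitude divided by the number of samples) and omits the significance testing, focal versus arm-level separation, and peak calling. The segment values, the `toy_amp_scores` name, the bin size, and the 0.1 log2-ratio threshold are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-sample segments: (sample, chrom, start, end, log2_ratio)
segments = [
    ("S1", "chr8", 0, 50, 0.9),
    ("S1", "chr8", 50, 100, 0.0),
    ("S2", "chr8", 0, 100, 0.6),
    ("S3", "chr8", 0, 30, 0.0),
    ("S3", "chr8", 30, 100, 0.7),
]

def toy_amp_scores(segments, bin_size=10, amp_threshold=0.1):
    """Sum amplified log2 ratios per genomic bin and divide by the
    number of samples, so each bin's score reflects the whole cohort."""
    n_samples = len({seg[0] for seg in segments})
    totals = defaultdict(float)
    for _sample, chrom, start, end, log2r in segments:
        if log2r <= amp_threshold:
            continue  # segment is not counted as amplified
        # Every bin the segment touches gets the segment's amplitude
        # (partially covered bins are counted as whole bins here).
        for b in range(start // bin_size, -(-end // bin_size)):
            totals[(chrom, b)] += log2r
    return {key: total / n_samples for key, total in totals.items()}

scores = toy_amp_scores(segments)
top_bin = max(scores, key=scores.get)
print(top_bin, round(scores[top_bin], 3))
```

With only three samples, a single sample changes a bin's score by up to a third of its amplitude, which is why cohort-level scores from very small batches are hard to interpret.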


Looks like I'm going to have to re-analyze my data. CopywriteR is an R package that generates per-sample CN segments from BAM files. It is computationally expensive (>100GB RAM for 3 samples), so I had to split my samples into batches of 3 each. I'm guessing that cripples GISTIC2.0, because n=3 is nothing in the context of statistical significance.

EDIT: It looks like sample size does not affect gene-level scoring in GISTIC2.0, which is what I'm after. I don't have to re-run all my samples, which is great news for me!


Hi Kevin, I have a similar question, but in my study I have only 15 tumor samples. Do you think I should use GISTIC, or is there another way to assess significance? Also, how can I apply the gain and loss thresholds explained on the COSMIC website? If you have any ideas, I would be very grateful.


GAIA is better than GISTIC in terms of ease of use. I would use GAIA.
