Control sample size is much lower than tumor sample size in TCGA
7.6 years ago

I have tried to use TCGA glioblastoma RNA-seq samples for differential expression analysis. However, I realized that the number of "Solid Tissue Normal" samples is much lower than the number of "Primary Tumor" samples. For the FPKM files in glioblastoma, there are 6 normal and 161 primary tumor samples.

Is this correct? Am I missing something?

TCGA Differential Expression RNA-Seq • 1.8k views

I wouldn't be surprised to have few normal samples since you're dealing with brain tissue here. You don't often remove normal brain tissue whether from the cancer patient or someone else.


You are right, but 124 of them are dead. Do you think differential expression under these conditions is statistically valid?


Samples are usually obtained during surgery, while trying to keep people alive. That's what I would assume unless there are more details on the samples' provenance.

Regarding the different sizes of the groups: the usual statistical tests do not require the groups to be the same size. In particular, as long as the assumptions of the test hold, the type I error rate (i.e. the probability of calling a difference statistically significant when there is no real difference) is not affected. However, the power of the test (i.e. the probability of rejecting the null hypothesis when it is false) is reduced, which means that the probability of making a type II error (i.e. concluding there is no difference when there really is one) is increased. To put it in less mathematical terms, larger sample sizes make it easier to detect smaller differences. Here the 6 normal samples, not the 161 tumor samples, limit what you can detect.

Also keep in mind that statistical significance and biological relevance are not linked a priori.
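To make the point about type I error and power concrete, here is a minimal sketch. It is not how RNA-seq differential expression is actually tested; it assumes a single normally distributed measurement with a known, equal standard deviation in both groups and a two-sided two-sample z-test. The function name `z_test_power`, the effect size of 1 standard deviation, and the balanced 83 vs 84 comparison are illustrative choices, not something from the TCGA data.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(n1, n2, effect, sigma=1.0, alpha=0.05):
    """Probability that a two-sided two-sample z-test rejects the null.

    Assumes normal data with known, equal standard deviation `sigma`
    in both groups (a simplification made for illustration only).
    `effect` is the true difference between the group means.
    """
    std_normal = NormalDist()
    # Standard error of the difference in means; the smaller group dominates it.
    se = sigma * sqrt(1.0 / n1 + 1.0 / n2)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = effect / se
    # Rejection happens in either tail of the test statistic's distribution.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# Same total number of samples (167), split unevenly versus evenly.
for n1, n2 in [(6, 161), (83, 84)]:
    false_positive_rate = z_test_power(n1, n2, effect=0.0)  # no true difference
    power = z_test_power(n1, n2, effect=1.0)                # true difference of 1 SD
    print(f"n = {n1} vs {n2}: type I error = {false_positive_rate:.3f}, power = {power:.3f}")
```

Under these assumptions the rejection rate with no true difference stays at `alpha` for both designs, while the power to detect a 1 SD difference is about 0.67 for 6 vs 161 and close to 1 for the balanced split.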
