I have 9 small gene expression data sets of tumor and normal samples. Some of the tumor samples have a matched normal sample from the same patient. Other tumor samples come from different patients and have no matched normal, so they are independent samples. Thus some samples are dependent and some are independent. All the samples follow a normal distribution. The samples are divided into groups that I have to analyze separately. Since the sample sizes to be compared vary widely across groups, I was wondering whether I can apply a t-test to this kind of data. (All data are normalized gene expression values.) Sample data:
          No. of normal samples   No. of tumor samples
group 1   12                      57
          (Here the 12 normal samples have 12 matched tumor samples from the same patients, whereas the remaining 45 tumor samples come from different patients and have no matched normal.)
group 2   2                       33
group 3   11                      106
..
..
..
group 9   2                       12
I tried looking for a solution, but it is really confusing which statistical test or method to use for such an analysis. How can I analyze such data group by group to find significant genes?
Thank you!
Hi. I am still a bit unclear about your samples. So you have 9 datasets, each with a varying number of control and tumor samples. In each of the 9 datasets, only a subset of the tumor samples has a matching control and the rest do not?
Are all of these samples from different patients? Are there any biological replicates? What do you mean by significant genes? Differentially expressed genes?
Thank you for your reply Damian.
Yes, in each of the 9 datasets only a subset of the tumor samples has a matching control and the rest do not.
In each dataset, every tumor sample with a matched control comes from the same patient as its control, while the tumor samples without controls come from different patients.
There are no biological replicates.
By significant genes, I mean differentially expressed genes.
Sorry for not being clearer.
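To make the design described in this thread concrete, here is a minimal sketch of the two standard t statistics it touches on: a paired t-test for the matched tumor/normal pairs, and Welch's unequal-variance t-test for the unmatched tumors against the normals. It is only an illustration, not a recommended analysis. The data are synthetic values for a single hypothetical gene in a group shaped like group 1 (12 pairs plus 45 unmatched tumors), and the function and variable names are made up for this example. It returns only t statistics and degrees of freedom; p-values would come from the t distribution with those degrees of freedom (for example via `scipy.stats.t.sf`, which is left out here to keep the sketch dependency-free). How to combine the paired and unpaired evidence, and how to correct for testing many genes, is not addressed.

```python
import math
import random
import statistics


def paired_t(tumor, normal):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [t - n for t, n in zip(tumor, normal)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of the differences
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def welch_t(a, b):
    """Welch's unequal-variance t statistic and Welch-Satterthwaite df."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Synthetic expression values for ONE hypothetical gene (assumed numbers, not real data).
random.seed(0)
patient_effect = [random.gauss(0, 1) for _ in range(12)]
normal_matched = [8.0 + p + random.gauss(0, 0.3) for p in patient_effect]
tumor_matched = [9.0 + p + random.gauss(0, 0.3) for p in patient_effect]
tumor_unmatched = [9.0 + random.gauss(0, 1) + random.gauss(0, 0.3) for _ in range(45)]

# Matched pairs: same patients, so test the within-patient differences.
t_paired, df_paired = paired_t(tumor_matched, normal_matched)

# Unmatched tumors vs. normals: different patients, so treat as independent groups.
t_welch, df_welch = welch_t(tumor_unmatched, normal_matched)

print(f"paired: t = {t_paired:.3f}, df = {df_paired}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.1f}")
```

The paired test uses only the within-patient differences, so patient-to-patient variation cancels out; the Welch test does not assume equal variances, which matters less for validity than the very small normal group sizes in some groups (for example 2 normals), where any t-test will have little power.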