I have four samples (A, B, C, D) and two classes of gene lists (E, F). For example, one gene belonging to E was positive (coded as 1) in A but negative (coded as 0) in B, C and D. How can I use glm to estimate the effect of the sample and the gene list on the values (1 or 0)? The data look like this:
pc list id gene
1 E A AIFM1
0 E B AIFM1
1 E C AIFM1
NA E D AIFM1
0 F A ARAF
0 F B ARAF
1 F C ARAF
1 F D ARAF
Thanks in advance!
So sorry.
I have 4 samples (A, B, C and D) and 2 gene lists (E and F). For example, the expression values of one gene in the E cluster were 1.1, 2.0, 0.5 and 0 in samples A, B, C and D, respectively. If the expression value was less than 1.0, I assigned 0; otherwise, 1.
My question: how can I use glm or another method to estimate the effect of the sample and the gene list on the values (1 or 0)?
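The coding rule described above can be sketched in R as follows. The vector names expr and pc are illustrative assumptions; the four values are the ones quoted for the example gene.

```r
# Expression values of the example gene in samples A, B, C and D
expr <- c(A = 1.1, B = 2.0, C = 0.5, D = 0)

# Assign 0 if the value is below 1.0, otherwise 1
pc <- as.integer(expr >= 1.0)
print(pc)
```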
Do you mean a model like
If that's the case, you can just run glm in R with
where data is the data.frame holding your table. R should automatically dummy-code the categorical variables.
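A minimal sketch of what such a fit might look like, assuming a logistic regression with pc as the binary response and list and id as categorical predictors. The formula, the binomial family and the name dat are assumptions for illustration, not necessarily what was meant above. The rows are the eight from the question.

```r
# The eight rows posted in the question (pc is NA for gene AIFM1 in sample D)
dat <- data.frame(
  pc   = c(1, 0, 1, NA, 0, 0, 1, 1),
  list = factor(c("E", "E", "E", "E", "F", "F", "F", "F")),
  id   = factor(c("A", "B", "C", "D", "A", "B", "C", "D")),
  gene = c(rep("AIFM1", 4), rep("ARAF", 4))
)

# Logistic regression; factors are dummy-coded automatically,
# with the first level (E for list, A for id) as the reference.
# Rows with a missing response are dropped by default.
fit <- glm(pc ~ list + id, family = binomial, data = dat)
summary(fit)
```

With only these few rows, expect warnings and unstable coefficient estimates; the call is shown for its form, not for its results.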
Thanks a lot. I did that, but the result showed that the sample was not significant. I am not sure whether glm is suitable for two categorical variables with 2 and 4 categories.
I hope your dataset is larger than 8 rows. I would not dichotomize. Do not switch from the raw values to this zero/one coding. You lose power, and that is not the only drawback.
Indeed, you need a larger dataset.