Hello friend,
Assuming that your metabolites have been normalised to the Z-scale and/or log-transformed (and thus roughly follow a normal distribution), you can just run a binary logistic regression model:
First, get your data in this format:
MyData
          Group  Metab1  Metab2  Metab3  Metab4
Sample 1  WT     11.39   10.62    9.75   10.34
Sample 2  WT     10.16    8.63    8.68    9.08
Sample 3  WT      9.29   10.24    9.89   10.11
Sample 4  KO     11.53    9.22    9.35    9.13
Sample 5  KO      8.35   10.62   10.25   10.01
Sample 6  KO     11.71   10.43    8.87    9.44
...
Then, set your Group variable as a factor and specify WT as the reference level:
MyData$Group <- factor(MyData$Group, levels=c("WT","KO"))
Then, I would test each metabolite independently in its own logistic regression model:
glm(Group ~ Metab1, data=MyData, family="binomial")
glm(Group ~ Metab2, data=MyData, family="binomial")
et cetera
Model p-values and coefficient estimates (the sign indicates which way the metabolite level goes in KO vs WT) can be extracted via the summary() function applied to the model object. You can also perform a Chi-squared ANOVA via anova(MyModel, test="Chisq").
You can set this up as a loop; see: Question about generalized linear model fitting
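As a minimal sketch of such a loop (not the code from the linked thread): the data below are simulated stand-ins for MyData, and the column names of the results table are my own choice. Each metabolite is fitted in its own model and the estimate and p-value are pulled out of summary().

```r
set.seed(1)

# Simulated stand-in for MyData: 20 WT and 20 KO samples, 4 metabolites
MyData <- data.frame(
  Group  = factor(rep(c("WT", "KO"), each = 20), levels = c("WT", "KO")),
  Metab1 = rnorm(40, mean = 10),
  Metab2 = rnorm(40, mean = 10),
  Metab3 = rnorm(40, mean = 10),
  Metab4 = rnorm(40, mean = 10)
)

# Shift Metab1 upwards in KO so that one metabolite carries some signal
ko <- MyData$Group == "KO"
MyData$Metab1[ko] <- MyData$Metab1[ko] + 1

# Every column except Group is treated as a metabolite
metabolites <- setdiff(colnames(MyData), "Group")

# Fit one logistic regression per metabolite and collect the key numbers
results <- do.call(rbind, lapply(metabolites, function(m) {
  fit   <- glm(reformulate(m, response = "Group"), data = MyData, family = "binomial")
  coefs <- summary(fit)$coefficients
  data.frame(
    Metabolite = m,
    Estimate   = coefs[m, "Estimate"],    # log odds of KO per unit increase
    StdError   = coefs[m, "Std. Error"],
    P          = coefs[m, "Pr(>|z|)"]
  )
}))

print(results)
```

A positive Estimate means that higher levels of that metabolite go with KO, because WT is the reference level.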
---------------------------------
If your aim is to identify a panel of predictors, then, from the results of the above, select the metabolites that are statistically significant. You will then have to run further tests on these to gauge their 'predictive' strength. For example, see:
You can also just do a penalised regression with all metabolites at the same time using the lasso, elastic-net, or ridge penalty: A: How to exclude some of breast cancer subtypes just by looking at gene expressio
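A rough sketch of the penalised approach, assuming the glmnet package is installed (the original answer does not name a package, so this is my choice) and again using simulated data. Here alpha=1 gives the lasso, alpha=0 gives ridge, and values in between give the elastic-net.

```r
set.seed(1)

# Simulated stand-in data: 20 WT and 20 KO samples, 4 metabolites
n     <- 40
Group <- factor(rep(c("WT", "KO"), each = n / 2), levels = c("WT", "KO"))
x     <- matrix(rnorm(n * 4, mean = 10), nrow = n,
                dimnames = list(NULL, paste0("Metab", 1:4)))

# Shift Metab1 upwards in KO so that one metabolite carries some signal
x[Group == "KO", "Metab1"] <- x[Group == "KO", "Metab1"] + 1.5

# Only run the fit if glmnet is available (assumption: it may not be installed)
if (requireNamespace("glmnet", quietly = TRUE)) {
  # Cross-validated lasso-penalised logistic regression on all metabolites at once
  cvfit <- glmnet::cv.glmnet(x, Group, family = "binomial", alpha = 1, nfolds = 5)

  # Coefficients at the lambda with the lowest cross-validated error;
  # metabolites shrunk to exactly zero have been dropped by the lasso
  print(coef(cvfit, s = "lambda.min"))
}
```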
Kevin
If you have two groups, have you considered a t-test?
My data is something like this, but the number of observations in WT is greater than in the knockout. Would you still suggest that I go for a t-test?
There is no need for the two groups to be the same size for a t-test.
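For instance, with made-up numbers (12 WT and 7 KO values for a single metabolite), t.test() in R runs without complaint. By default it uses the Welch version, which does not assume equal variances.

```r
set.seed(42)

# Simulated values of one metabolite: unequal group sizes on purpose
wt <- rnorm(12, mean = 10, sd = 1)  # 12 WT samples
ko <- rnorm(7,  mean = 11, sd = 1)  # 7 KO samples

# Unpaired two-sample t-test; var.equal = FALSE (Welch) is the default
res <- t.test(ko, wt)

print(res$method)
print(res$p.value)
```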
Okay, but as far as I understand these are not independent, because I am taking the knockout of the same gene that I am studying. Am I correct? If yes, then I should go for a paired t-test, shouldn't I?