regression model to use to show the difference
6.7 years ago
1769mkc ★ 1.2k

I have a wild-type and a knockout condition. After the knockout, the level of a certain metabolite goes up; there is a difference, as is also seen in the phenotype. My question is: what kind of regression model, or any other method, should I use to show the difference? Any suggestion or help would be appreciated.

R • 1.6k views

If you have two groups, have you considered a t-test?

WT      Amo         GS      Amo
6.92    461.333     6.12    408.000
6.9     460.000     6.98    465.333
18.8    1253.333    12.69   846.000
18.75   1250.000    10.8    720.000
33.36   2224.000    11.2    746.667
21.55   1436.667    11.82   788.000
21.95   1463.333    22.96   1530.667
11.54   769.333     28.41   1894.000
5.22    348.000     47.7    3180.000
16.1    1073.333    3.28    218.667
13.41   894.000     14.2    946.667
31      2066.667    17      1133.333
55      3666.667    25      1666.667
53.4    3560.000    40.2    2680.000
                    53      3533.333
                    41      2733.333

My data look something like this. I mean, the number of observations in WT is more than in the knockout. Would you suggest that I go for a t-test?


There is no need for the two groups to be the same size for a t-test.
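As a minimal sketch of this, the values from the "WT" and "GS" columns posted above can be passed straight to t.test(), which runs Welch's unequal-variance test by default and accepts groups of different sizes. Treating "GS" as the knockout group is an assumption here, and the "Amo" columns are left out because they appear to be the same values multiplied by a constant, which would not change the test. The Wilcoxon rank-sum test is added only as a hedge in case the values are not close to normally distributed.

```r
# Values copied from the "WT" and "GS" columns of the table posted above.
# Assumption: "GS" is the knockout group.
wt <- c(6.92, 6.9, 18.8, 18.75, 33.36, 21.55, 21.95,
        11.54, 5.22, 16.1, 13.41, 31, 55, 53.4)
gs <- c(6.12, 6.98, 12.69, 10.8, 11.2, 11.82, 22.96, 28.41,
        47.7, 3.28, 14.2, 17, 25, 40.2, 53, 41)

# Welch's two-sample t-test (var.equal = FALSE is the default);
# the two groups may have different numbers of observations.
welch <- t.test(wt, gs)
print(welch)

# Rank-based alternative that does not assume normality.
# exact = FALSE requests the normal approximation.
mw <- wilcox.test(wt, gs, exact = FALSE)
print(mw)
```

If the values are strongly right-skewed, running the same test on log-transformed values, e.g. t.test(log(wt), log(gs)), may be more appropriate.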


Okay, but as far as I understand these are not independent, because I am taking the knockout of the same gene that I am studying. Am I correct? If so, I should go for a paired t-test, shouldn't I?

6.7 years ago

Hello friend,

Assuming that your metabolites have been normalised to the Z-scale and/or are logged (and thus follow a normal distribution), you can just run a binary logistic regression model:

First, get your data in this format:

MyData
           Group   Metab1  Metab2 Metab3 Metab4
Sample 1   WT      11.39   10.62   9.75  10.34
Sample 2   WT      10.16    8.63   8.68   9.08
Sample 3   WT       9.29   10.24   9.89  10.11
Sample 4   KO      11.53    9.22   9.35   9.13
Sample 5   KO       8.35   10.62  10.25  10.01
Sample 6   KO      11.71   10.43   8.87   9.44
...

Then, set your Group variable as factors and specify WT as the reference level:

MyData$Group <- factor(MyData$Group, levels=c("WT","KO"))
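For illustration, here is a minimal sketch that builds such a data frame from the six example rows shown above and sets the reference level. The fourth metabolite column is named Metab4 here, which is an assumption made so that the column names are unique; real data would have many more samples.

```r
# Toy data frame built from the six example rows shown above.
# Assumption: the fourth metabolite column is called Metab4.
MyData <- data.frame(
  Group  = c("WT", "WT", "WT", "KO", "KO", "KO"),
  Metab1 = c(11.39, 10.16, 9.29, 11.53, 8.35, 11.71),
  Metab2 = c(10.62, 8.63, 10.24, 9.22, 10.62, 10.43),
  Metab3 = c(9.75, 8.68, 9.89, 9.35, 10.25, 8.87),
  Metab4 = c(10.34, 9.08, 10.11, 9.13, 10.01, 9.44),
  row.names = paste("Sample", 1:6)
)

# Without an explicit 'levels' argument, factor() sorts levels alphabetically,
# which would make KO the reference. Listing WT first makes it the reference.
MyData$Group <- factor(MyData$Group, levels = c("WT", "KO"))

levels(MyData$Group)   # "WT" "KO"
table(MyData$Group)
```

With a two-level factor as the response, glm() with a binomial family treats the first level (WT) as "failure" and the second (KO) as "success", so a positive coefficient means higher metabolite values are associated with KO.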

Then, I would check each metabolite independently in the logistic regression modelling:

glm(Group ~ Metab1, data=MyData, family="binomial")
glm(Group ~ Metab2, data=MyData, family="binomial")
et cetera

Model p-values and estimates / coefficients (the sign of the estimate indicates which way the metabolite level goes in KO vs WT) can be extracted via the summary() function applied to the model object. You can also perform a Chi-squared ANOVA via anova(MyModel, test="Chisq")
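A minimal sketch of that extraction step, using only the six example rows for Metab1 shown earlier as toy data (far too few samples for a meaningful p-value; the point is only where the numbers live in the model object):

```r
# Toy data: Metab1 values from the six example rows above.
MyData <- data.frame(
  Group  = factor(c("WT", "WT", "WT", "KO", "KO", "KO"), levels = c("WT", "KO")),
  Metab1 = c(11.39, 10.16, 9.29, 11.53, 8.35, 11.71)
)

MyModel <- glm(Group ~ Metab1, data = MyData, family = "binomial")

# Coefficient table: one row per term, with columns
# "Estimate", "Std. Error", "z value" and "Pr(>|z|)".
coefs <- coef(summary(MyModel))
estimate <- coefs["Metab1", "Estimate"]    # log-odds of KO per unit of Metab1
p_value  <- coefs["Metab1", "Pr(>|z|)"]    # Wald test p-value
odds_ratio <- exp(estimate)                # same effect on the odds scale

# Chi-squared analysis of deviance table (likelihood-ratio type test).
lrt <- anova(MyModel, test = "Chisq")
print(coefs)
print(lrt)
```

A positive estimate (odds ratio above 1) would suggest higher levels in KO than in WT, and a negative one the reverse.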

You can set this up as a loop: Question about generalized linear model fitting
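One way such a loop could look, as a rough sketch on the six example rows shown earlier (the column name Metab4 and the Benjamini-Hochberg adjustment at the end are my own additions, not part of the linked thread):

```r
# Toy data from the six example rows above; Metab4 is an assumed column name.
MyData <- data.frame(
  Group  = factor(c("WT", "WT", "WT", "KO", "KO", "KO"), levels = c("WT", "KO")),
  Metab1 = c(11.39, 10.16, 9.29, 11.53, 8.35, 11.71),
  Metab2 = c(10.62, 8.63, 10.24, 9.22, 10.62, 10.43),
  Metab3 = c(9.75, 8.68, 9.89, 9.35, 10.25, 8.87),
  Metab4 = c(10.34, 9.08, 10.11, 9.13, 10.01, 9.44)
)

# Every column except the group label is treated as a metabolite.
metabolites <- setdiff(colnames(MyData), "Group")

# Fit one logistic regression per metabolite and collect estimate and p-value.
results <- do.call(rbind, lapply(metabolites, function(m) {
  fit <- glm(reformulate(m, response = "Group"), data = MyData, family = "binomial")
  cf  <- coef(summary(fit))
  data.frame(Metabolite = m,
             Estimate   = cf[m, "Estimate"],
             P          = cf[m, "Pr(>|z|)"])
}))

# Adjust for testing many metabolites (Benjamini-Hochberg false discovery rate).
results$P.adj <- p.adjust(results$P, method = "BH")
print(results)
```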

---------------------------------

If your aim is to identify a panel of predictors, then, from the results of the above, select the metabolites that are statistically significant and then you will have to perform further test statistics on these to gauge their 'predictive' strength. For example, see:

You can also just do a penalised regression with all metabolites at the same time using the lasso, elastic-net, or ridge penalty: A: How to exclude some of breast cancer subtypes just by looking at gene expressio
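As a hedged sketch of the penalised approach, the following uses the glmnet package (assumed to be installed) on simulated data, since penalised regression with cross-validation needs more samples than the small example table above. The sample size, the five metabolite columns, and the rule that generates the groups are all made up for illustration.

```r
library(glmnet)

set.seed(1)

# Simulated, already-normalised data: 60 samples x 5 hypothetical metabolites.
n <- 60
x <- matrix(rnorm(n * 5), nrow = n,
            dimnames = list(NULL, paste0("Metab", 1:5)))

# Simulated grouping in which only Metab1 is related to KO status.
Group <- factor(ifelse(x[, "Metab1"] + rnorm(n) > 0, "KO", "WT"),
                levels = c("WT", "KO"))

# alpha = 1 is the lasso, alpha = 0 is ridge, values in between give elastic-net.
# cv.glmnet() picks the penalty strength lambda by cross-validation.
cv_fit <- cv.glmnet(x, Group, family = "binomial", alpha = 1, nfolds = 5)

# Coefficients at the lambda with the lowest cross-validated deviance;
# metabolites shrunk to exactly zero are dropped from the model.
print(coef(cv_fit, s = "lambda.min"))
```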

Kevin


Kevin, thank you very much. I was looking for this for my other work. I'm really glad that you posted this.


@Kevin, can I use your method for gene expression data or differentially expressed genes, so that the only thing I need to do is format my data as you have described?


Yes, you can use this same approach for the genes that are differentially expressed, so that you can further reduce the number of genes in your final model.
