Hi dear friends, thank you so much for your constant help here. Today I would like to ask a question about linear regression analysis using limma. I have RNA-seq data, so the expression profile is counts, whereas the phenotypic data is continuous. I want to use linear regression to predict phenotypic features using 20,000 genes as predictor variables. For example, say BMI is the phenotype I would like to predict from my list of genes. Could you advise me whether I can do that using limma? From what I saw in the user's guide, if I am not mistaken, it works the other way round (predicting gene expression from the phenotype). I would also be grateful if you could show a few lines of code on how to start with it. Thanks again! AD
How many samples do you have for training the linear model, what tissue/s do they come from, is it bulk or single-cell, and how many per individual?
Thanks for your quick reply! It is bulk RNA-seq and I have 50 samples. There are 10 individuals in group A, 8 in group B, 12 in group C and 10 in group D.
And what are those groups A--D?
@user_without_id,
Hi, I think I didn't state my question clearly enough. I have four sample groups (A, B, C, D) and each group has biological replicates: A has 10, B has 8, C has 12 and D has 10. Now I want to fit a linear regression of the form
lm(BMI ~ gene_i + age)
with age as a covariate. Do you think you could show me how to begin? Best,
Amare
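Was your idea 1) to fit 20,000 models, one for each gene, or 2) to put the expression of all 20,000 genes into one model? In case of 2), you have far fewer samples than predictors, so an ordinary linear model cannot estimate all the coefficients without some form of regularisation or feature selection, cf. e.g. this tutorial.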
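To make the two options concrete, here is a minimal base R sketch on simulated data. Everything in it is an assumption for illustration: the objects `counts` and `pheno`, the helper `fit_one_gene`, the 200 simulated genes (instead of 20,000) and the simple log2-CPM transform are stand-ins, not your data or a recommended pipeline. It only shows what option 1 looks like with `lm()` and why option 2 breaks down with plain least squares.

```r
set.seed(1)

# Hypothetical stand-in data: 200 genes x 50 samples of counts,
# plus a phenotype table with BMI and age (all simulated).
n_genes   <- 200
n_samples <- 50
counts <- matrix(rnbinom(n_genes * n_samples, mu = 100, size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("s", seq_len(n_samples))))
pheno <- data.frame(BMI = rnorm(n_samples, mean = 25, sd = 4),
                    age = round(runif(n_samples, 20, 70)))

# Log2 counts-per-million with a small offset, so the counts are on a
# scale where a linear model is more reasonable.
lib_size <- colSums(counts)
logcpm   <- log2(t(t(counts + 0.5) / (lib_size + 1)) * 1e6)

# Option 1: one model per gene, BMI ~ gene_i + age.
# Assumes no gene is constant across samples.
fit_one_gene <- function(expr) {
  fit <- lm(BMI ~ expr + age, data = data.frame(pheno, expr = expr))
  coef(summary(fit))["expr", c("Estimate", "Pr(>|t|)")]
}
res <- as.data.frame(t(apply(logcpm, 1, fit_one_gene)))
colnames(res) <- c("slope", "p_value")
res$fdr <- p.adjust(res$p_value, method = "BH")  # multiple-testing correction
head(res[order(res$p_value), ])

# Option 2: all genes in one model. With more predictors than samples,
# lm() cannot estimate every coefficient and returns NA for the rest.
fit_all <- lm(BMI ~ ., data = data.frame(BMI = pheno$BMI, t(logcpm)))
sum(is.na(coef(fit_all)))
```

If the goal is per-gene association rather than prediction, the usual limma route is the reverse regression you saw in the user's guide: put BMI and age in the design matrix (e.g. `model.matrix(~ BMI + age)`) and run `voom`, `lmFit` and `eBayes`, then look at the BMI coefficient for each gene. That asks which genes are associated with BMI, adjusted for age, and benefits from limma's variance moderation.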