5.0 years ago
hosin
Hello everyone. I want to do an association analysis (with a linear regression model) on data related to the age of animals in 40 different populations. However, the values are the same within each breed, like this:
23
23
23
23
11
11
11
11
12
12
12
12
So, is a linear model reasonable for analysing this data? Thanks.
Try to give a little bit more information. I'm assuming that column is age?
To do linear regression you need at least two variables, say X and Y. You want to predict Y based on inputs of X. Say age is your X variable, and you want to predict some other, dependent variable such as height or weight. That would probably work pretty poorly until you add more variables (the breed of the animal, which you said you have, should be a variable as well).
Linear regression is useful if you want to build a model and predict for new animals using new data with the same variables, or if you want the model to estimate the relation between Y and the X variables (e.g. for each year the animal ages, it weighs approximately beta more kg). It is difficult to say whether it will work well for your data, but if your hypothesis is that there is a linear relation in your data, then it might make sense.
Thank you very much for your time. Actually, I want to examine relationships between some structures in the genome and several variables, one of which is age. My data look like what I showed above, so age is one of the variables. But I want to analyse them separately, for example structures with age, then structures with weight. I would be thankful for any further suggestions.
OK. It sounds like linear regression would be reasonable to look at. However, there are many different ways you could specify your model: are the variables fixed or random effects (i.e. would you need a mixed-effects model)? Are there interactions between variables?
To start, you could just try a model where every variable is a fixed effect (just as an exercise to get started). Fit it with something like R's lm() or Python's statsmodels, and look at the coefficients you get for each variable, along with the p-values and confidence intervals. Have a look at some tutorials.
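As a rough illustration of that first step, here is a minimal sketch using the statsmodels formula interface. Everything in it is an assumption made for the example: the column names (population, age, structure), the three populations, and the simulated response values are hypothetical and are not the poster's data. Only the pattern of ages (constant within each population) follows the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: age is constant within each population, as in the question.
rng = np.random.default_rng(0)
age = np.repeat([23, 11, 12], 4)
population = np.repeat(["A", "B", "C"], 4)

# Hypothetical response (some measure of a genome structure), simulated here
# with a true slope of 0.5 per year of age plus random noise.
structure = 2.0 + 0.5 * age + rng.normal(scale=1.0, size=age.size)

df = pd.DataFrame({"population": population, "age": age, "structure": structure})

# Fit the simple fixed-effect model: structure ~ intercept + beta * age.
model = smf.ols("structure ~ age", data=df).fit()

print(model.params)      # estimated intercept and slope (beta) for age
print(model.pvalues)     # p-value for each coefficient
print(model.conf_int())  # 95% confidence interval for each coefficient
```

The same pattern could be repeated for each variable analysed separately (for example "structure ~ weight"). Note that in this toy data age takes a single value per population, so a population term could not be added next to age in the same model; the two would be perfectly collinear.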