Question

[statistics question] What's the right process to find and use interactions between variables?

8.6 years ago

moxu ▴ 510

Suppose you want to fit a linear regression, y = x1 + ... + x6, and you think there are interactions among the predictors. You could run an exhaustive trial-and-error search for interactions, but that would be multiple testing and thus not a good idea, right?

What would be the right procedure to find and use such interactions?

Thank you!

R • 1.2k views

ADD COMMENT • link updated 8.6 years ago by Steven Lakin ★ 1.8k • written 8.6 years ago by moxu ▴ 510

score 2 · Answer 1 · 2016-09-22

You will have to examine how each interaction affects the predicted variable; as far as I know, there isn't another way to see whether interactions exist. This is part of the punishing aspect of classical statistics: the more thorough and curious you are about different aspects of your experiment, the larger your adjusted p-values become because of multiple-testing correction.
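To put a rough number on that multiple-testing cost for the six-predictor model in the question, here is a minimal sketch. It assumes only pairwise interactions are tested, that the tests are independent, and that a Bonferroni correction is used; none of these choices come from the thread, and other corrections exist.

```python
from itertools import combinations
from math import comb

predictors = [f"x{i}" for i in range(1, 7)]  # x1..x6 from the question

# Every pairwise interaction an exhaustive search would test
pairs = list(combinations(predictors, 2))
n_tests = len(pairs)  # equals comb(6, 2)

alpha = 0.05
# Probability of at least one false positive across all tests, assuming
# the tests are independent and no interaction truly exists
fwer = 1 - (1 - alpha) ** n_tests

# Bonferroni: test each interaction at alpha / n_tests so that the
# family-wise error rate stays at or below alpha
bonferroni_alpha = alpha / n_tests

print(n_tests, round(fwer, 3), round(bonferroni_alpha, 5))
```

With 15 uncorrected tests the chance of at least one spurious "significant" interaction is a little over one half, which is why each individual test has to clear a much stricter threshold.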

If you're inclined, Bayesian approaches to the generalized linear model are well suited to investigating many combinations of variables: your intentions as the experimenter have no effect on the posterior distribution, and there is only one posterior distribution, which you can examine in as many ways as you like. At the bottom of this page is a link to Kruschke's scripts for JAGS and Stan (Markov chain Monte Carlo samplers), which include procedures for Bayesian GLMs.

If you do find an interaction, the effect of one predictor on the predicted variable depends on the value of another, so their effects are not simply additive. Whether you think an interaction is important depends on how it affects the predicted variable and what you're trying to accomplish. If you find an interaction of interest, including it in your model is generally the correct way to proceed.
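As an illustration of what including an interaction in the model means mechanically, the sketch below adds a product column x1*x2 to the design matrix and fits ordinary least squares. The data are made up and noise-free, the helper names `solve` and `fit_ols` are hypothetical, and the normal-equations solver is only there to keep the example free of third-party packages; in practice you would use a statistics library.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up, noise-free data: y = 1 + 2*x1 + 3*x2 + 0.5*x1*x2
grid = [(x1, x2) for x1 in range(4) for x2 in range(4)]
y = [1 + 2 * x1 + 3 * x2 + 0.5 * x1 * x2 for x1, x2 in grid]

# Design matrix columns: intercept, x1, x2, and the interaction x1*x2
design = [[1.0, x1, x2, x1 * x2] for x1, x2 in grid]
beta = fit_ols(design, y)
print([round(b, 6) for b in beta])
```

Because the data contain no noise, the fitted coefficients should match the generating values up to floating-point error, with the last one being the interaction effect. In R's formula syntax the same model is written `y ~ x1 * x2`, which expands to `x1 + x2 + x1:x2`.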

Biostars probably isn't the best place to get this information; check out Cross Validated, the statistics Stack Exchange site, for access to people who are very knowledgeable in this area.

score 0 · Answer 2 · 2016-09-22

8.6 years ago

LLTommy ★ 1.2k

I think you are simply looking for correlation...?

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or two sets of data. Correlation is any of a broad class of statistical relationships involving dependence, though in common usage it most often refers to the extent to which two variables have a linear relationship with each other.

Taken from Wikipedia, where you can also find some statistical tests you might want to use to check it.

Not really; I'm trying to add interactions among the xi's to the linear regression.

ADD REPLY • link 8.6 years ago by moxu ▴ 510