Regression factor levels question
20 months ago
Jára • 0

Hi all,

I am experimenting with logistic regression for combinations of two SNPs to assess their combined effects on risk of disease.

I have several genotype combinations of these two SNPs (het_het, hom_het, etc.) and have coded them as levels of a single factor, with ref_ref set as the reference level. I then run the regression and get a coefficient for each level relative to the reference. My issue is that if I run separate analyses, say one with just het_het vs. ref_ref, the coefficients differ from those I get when I include all the levels together.

My understanding is that this shouldn't be the case, so I'm a bit puzzled. The covariates are always the same and ref group is always the same. There is just a big difference depending on whether I include all factor levels or test them one at a time vs. ref.

Any ideas or pointers?

logistic-regression snp • 528 views

When you refer to separate analyses, do you mean splitting your dataset and then fitting multiple regressions? If so, you would expect the coefficients to differ from those of the full-dataset fit, since each fit only sees a reduced subset of your larger dataset.
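
To make that concrete, here is a rough sketch on simulated data (not your data; the group effects, sample size, and the single covariate `z` are all made up for illustration). It fits a plain Newton-Raphson logistic regression using only the Python standard library. With no covariates, the het_het coefficient is the same whether you fit all levels together or subset to het_het vs. ref_ref, because each group's fitted probability is just its observed proportion. Once a covariate is in the model, its slope is estimated from different rows in the two fits, so the het_het coefficient moves as well. How much it moves depends on the data.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_logistic(X, y, iters=30):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

def het_het_coef(rows, levels, use_cov):
    """Fit with dummies for `levels` (ref_ref is the baseline) and
    return the het_het coefficient."""
    X = [[1.0]
         + [1.0 if g == lv else 0.0 for lv in levels]
         + ([z] if use_cov else [])
         for g, z, _ in rows]
    y = [yi for _, _, yi in rows]
    return fit_logistic(X, y)[1 + levels.index("het_het")]

# Simulated data: the effects below are hypothetical, chosen for illustration.
random.seed(1)
true_effect = {"ref_ref": 0.0, "het_het": 0.5, "hom_het": 1.0}
rows = []
for _ in range(900):
    g = random.choice(list(true_effect))
    z = random.gauss(0.0, 1.0)  # stand-in for a covariate such as scaled age
    eta = -0.5 + true_effect[g] + 0.8 * z
    y = 1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
    rows.append((g, z, y))

# "Separate analysis": keep only het_het and ref_ref rows.
subset = [r for r in rows if r[0] in ("ref_ref", "het_het")]

full_nocov = het_het_coef(rows, ["het_het", "hom_het"], use_cov=False)
sub_nocov = het_het_coef(subset, ["het_het"], use_cov=False)
full_cov = het_het_coef(rows, ["het_het", "hom_het"], use_cov=True)
sub_cov = het_het_coef(subset, ["het_het"], use_cov=True)

print("no covariate:   full =", full_nocov, " subset =", sub_nocov)
print("with covariate: full =", full_cov, " subset =", sub_cov)
```

So if your covariates are what drive the gap, that is expected behaviour and not a coding error. The full model simply uses every sample to estimate the covariate effects.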

Hm, yes I think that's what I'm doing actually. Would it be better to just include all the factor levels together? Should the separate analyses work out alright if I add another factor to capture all "other" combinations and thus not reduce the size of the dataset?

I would include them all in one model.

As a side note, if you have complex model designs, it's often easier to use a package like emmeans to work with and explore your fitted model.
