I have a case-control dataset and I want to perform logistic regression and conditional logistic regression on HLA multi-allelic data using R. I want to find the effect of specific alleles on a trait. How do I do this, and what format should the data be in? Most examples are based on biallelic SNP data. For instance, at HLA-A I may have up to 30 unique alleles, and at HLA-B it could be 50. Should I recode all the alleles and perform logistic regression on genotype pairs?
If you are merely asking this as a technical question, then you can do this in R via glm(). Your SNP predictors can be encoded categorically as AA, AB, BB, or continuously as minor allele counts.
Kevin
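
To illustrate how the two encodings could carry over to a multi-allelic locus, here is a minimal sketch rather than a definitive analysis. It assumes each person has two HLA-A alleles stored in hypothetical columns a1 and a2, and it tests one allele at a time against all others. The data are simulated and the allele list is only an illustrative subset; names such as dat, dosage, and A02_count are made up for the example.

```r
# Simulated toy data (hypothetical): one row per individual, two HLA-A alleles each.
set.seed(1)
alleles <- c("A*01:01", "A*02:01", "A*03:01", "A*24:02")  # illustrative subset only
n <- 200
dat <- data.frame(
  case = rbinom(n, 1, 0.5),                  # 1 = case, 0 = control
  a1   = sample(alleles, n, replace = TRUE), # first allele
  a2   = sample(alleles, n, replace = TRUE), # second allele
  stringsAsFactors = FALSE
)

# Continuous (additive) coding: copies of each allele carried, 0, 1, or 2.
dosage <- sapply(alleles, function(al) (dat$a1 == al) + (dat$a2 == al))
colnames(dosage) <- make.names(alleles)      # syntactic names, e.g. "A.02.01"

# Test a single allele against all other alleles combined.
dat$A02_count <- dosage[, make.names("A*02:01")]
fit_add <- glm(case ~ A02_count, data = dat, family = binomial)

# Categorical coding: 0/1/2 copies as a factor, analogous to BB/AB/AA for a SNP.
dat$A02_geno <- factor(dat$A02_count, levels = 0:2)
fit_cat <- glm(case ~ A02_geno, data = dat, family = binomial)

summary(fit_add)$coefficients  # log-odds estimate per copy of the allele
exp(coef(fit_add))             # the same effect on the odds-ratio scale
```

Looping the same model over every column of dosage would give one test per allele, which then needs a multiple-testing correction. For matched case-control sets, clogit() from the survival package takes a similar formula with an added strata() term identifying the matched set.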