Full rank design matrix?
6.7 years ago
star ▴ 350

I have a design matrix for my data, shown below. When I run the commands to analyze and compare the different groups, I get an error.

I would like to make these comparisons: L4vsL6.L8, Q3vsQ5.Q7, QvsL

design matrix:

> design

                    organoids_biological_samples   method
    L4_D49_rep_1                              L4      L
    L4_D49_rep_2                              L4      L
    L6_L8_D49_rep_1                        L6_L8      L
    L6_L8_D49_rep_2                        L6_L8      L
    Q3_D49_rep_1                              Q3      Q
    Q3_D49_rep_2                              Q3      Q
    Q5_Q7_D49_rep_1                        Q5_Q7      Q
    Q5_Q7_D49_rep_2                        Q5_Q7      Q

> design$organoids_biological_samples <- factor(design$organoids_biological_samples, levels = c("L4","L6_L8", "Q3", "Q5_Q7"))
> design$method <- factor(design$method, levels = c("L", "Q"))

> all(rownames(design) %in% colnames(data))

> all(rownames(design) == colnames(data))

> Group <- factor(paste(design$organoids_biological_samples,design$method,sep="."))

> design<- cbind(design,Group)

> design.matrix <- model.matrix(~0+Group+method,design)

> colnames(design.matrix) <- c("L4.L", "L6_L8.L", "Q3.Q", "Q5_Q7.Q", "method")

> design.matrix

                    L4.L  L6_L8.L  Q3.Q  Q5_Q7.Q  method
    L4_D49_rep_1       1       0    0       0      0
    L4_D49_rep_2       1       0    0       0      0
    L6_L8_D49_rep_1    0       1    0       0      0
    L6_L8_D49_rep_2    0       1    0       0      0
    Q3_D49_rep_1       0       0    1       0      1
    Q3_D49_rep_2       0       0    1       0      1
    Q5_Q7_D49_rep_1    0       0    0       1      1
    Q5_Q7_D49_rep_2    0       0    0       1      1
    attr(,"assign")
    [1] 1 1 1 1 2
    attr(,"contrasts")
    attr(,"contrasts")$Group
    [1] "contr.treatment"

    attr(,"contrasts")$method
    [1] "contr.treatment"

> edgeR.dgelist = DGEList(counts = data,group = Group)

> edgeR.dgelist = calcNormFactors(edgeR.dgelist,method = "TMM")

> CommonDisp <- estimateGLMCommonDisp(edgeR.dgelist, design.matrix)

Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset,  : 
  Design matrix not of full rank.  The following coefficients not estimable:
 method
R edgeR bioconductor • 3.2k views
6.7 years ago
russhh 5.7k

The mathematical stuff:

The fifth column of your design matrix (method) is the sum of the third and fourth columns (Q3.Q and Q5_Q7.Q). So you can take a non-trivial linear combination of columns 3, 4, and 5 and get the zero vector (col3 + col4 - col5 = 0). The columns are therefore linearly dependent, and the design matrix is not of full rank.
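
To check the dependency numerically, here is a minimal sketch in base R. It rebuilds the sample table from the question by hand (the vectors `samples` and `method` are reconstructed for illustration, not taken from the poster's session) and compares the number of columns with the rank reported by `qr()`.

```r
# Rebuild the grouping from the question (base R only, no count data needed).
samples <- c("L4", "L4", "L6_L8", "L6_L8", "Q3", "Q3", "Q5_Q7", "Q5_Q7")
method  <- factor(substr(samples, 1, 1), levels = c("L", "Q"))
Group   <- factor(paste(samples, method, sep = "."))

# Same formula as in the question: one column per group plus a method column.
X <- model.matrix(~ 0 + Group + method)

ncol(X)      # 5 coefficients requested
qr(X)$rank   # 4 linearly independent columns

# col3 + col4 - col5 is the zero vector, i.e. the dependency described above.
dep <- X[, 3] + X[, 4] - X[, 5]
all(dep == 0)  # TRUE
```

Whenever the rank is smaller than the number of columns, at least one coefficient cannot be estimated, which is what the edgeR error reports for `method`.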

The scientific stuff:

All your "L4" and "L6_L8" samples were assessed with method "L", and all your "Q3" and "Q5_Q7" samples were assessed with method "Q". It's therefore not possible to distinguish the effect of method "L" vs method "Q", because any method-level difference is completely confounded with the sample-level differences. You haven't explained what your methods/samples correspond to (so I might have missed an important design detail), but to assess the effects of Q vs L, you'd typically assess all your samples with both methods.
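
One possible way around the error, sketched under assumptions rather than taken from the original post, is to drop `method` from the formula and express the three comparisons as contrasts between the four group columns. The contrast matrix below is written by hand in base R. The name `contrasts` and the 0.5 weights for QvsL are illustrative choices. In edgeR, each column could then be supplied through the `contrast` argument of `glmLRT()`.

```r
# Group-only design: one column per group, no separate method term.
samples <- c("L4", "L4", "L6_L8", "L6_L8", "Q3", "Q3", "Q5_Q7", "Q5_Q7")
Group   <- factor(paste(samples, substr(samples, 1, 1), sep = "."))

X2 <- model.matrix(~ 0 + Group)
colnames(X2) <- levels(Group)

# Full rank: as many independent columns as coefficients.
qr(X2)$rank == ncol(X2)

# Hand-written contrasts; rows follow the column order of X2.
contrasts <- cbind(
  L4vsL6.L8 = c(1, -1, 0, 0),
  Q3vsQ5.Q7 = c(0, 0, 1, -1),
  QvsL      = c(-0.5, -0.5, 0.5, 0.5)  # mean of Q groups minus mean of L groups
)
rownames(contrasts) <- colnames(X2)
contrasts
```

Note that QvsL here is still a comparison between sample groups that happen to differ in method, so it does not separate the method effect from the sample effect.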
