Hi,
I'm new to contrasts and design formulas in DESeq2. I have some RNA-seq data from MS patients. Here is an example of my sample table:
NAME STATUS CELLTYPE COMPCHLES
S1.A MS CD4mem LESLOW
S1.B MS CD4naiv LESLOW
S1.C MS CD8mem LESLOW
S1.D MS CD8naiv LESLOW
S2.A HV CD4mem HV
S2.B HV CD4naiv HV
S2.C HV CD8mem HV
S2.D HV CD8naiv HV
S3.A MS CD4mem LESHIGH
S3.B MS CD4naiv LESHIGH
S3.C MS CD8mem LESHIGH
S3.D MS CD8naiv LESHIGH
S4.A HV CD4mem HV
S4.B HV CD4naiv HV
S4.C HV CD8mem HV
S4.D HV CD8naiv HV
I want to compare COMPCHLES (number of brain lesions) between my patients, depending on CELLTYPE and the disease STATUS. I tried something like this:
dds <- DESeqDataSetFromMatrix(countData = exprDat, colData = sampleAnnot, design = ~CELLTYPE + STATUS + COMPCHLES)
But it's not working, and I don't really understand the following message:
Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed.
If I could get some clues about this error and some help solving my problem, that would be nice.
Thanks, Hadrien
Search for "the model matrix is not full rank" and you will find lots of posts discussing the issue and how to solve it. This one may be useful to you: With DESeq2 "Not full rank" Error with design ~ line + time + condition
The rank of a model matrix, loosely speaking, is the number of linearly independent columns it contains. The information in your STATUS column is already contained in the COMPCHLES column (every HV sample is HV in both columns, and every MS sample is either LESLOW or LESHIGH), as swbarnes2 also pointed out in the answer section. That is why the model matrix is not full rank, and you need to drop at least one of them from the design formula.
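To make the redundancy concrete, here is a minimal base-R sketch (no DESeq2 needed) that rebuilds the sample table from the question and compares the rank of the model matrix with its number of columns. The variable names mm_full and mm_reduced are just illustrative, and setting HV as the reference level is an assumption on my side.

```r
# Sample table from the question, with HV assumed as the reference level.
sampleAnnot <- data.frame(
  NAME      = paste0(rep(c("S1", "S2", "S3", "S4"), each = 4), ".",
                     c("A", "B", "C", "D")),
  STATUS    = factor(rep(c("MS", "HV", "MS", "HV"), each = 4),
                     levels = c("HV", "MS")),
  CELLTYPE  = factor(rep(c("CD4mem", "CD4naiv", "CD8mem", "CD8naiv"),
                         times = 4)),
  COMPCHLES = factor(rep(c("LESLOW", "HV", "LESHIGH", "HV"), each = 4),
                     levels = c("HV", "LESLOW", "LESHIGH"))
)

# Design from the question: STATUS and COMPCHLES carry overlapping information.
mm_full <- model.matrix(~ CELLTYPE + STATUS + COMPCHLES, data = sampleAnnot)
cat("full design: columns =", ncol(mm_full),
    " rank =", qr(mm_full)$rank, "\n")

# The MS indicator is exactly the sum of the two lesion-group indicators,
# which is the linear combination the error message complains about.
status_is_redundant <- all(
  mm_full[, "STATUSMS"] ==
    mm_full[, "COMPCHLESLESLOW"] + mm_full[, "COMPCHLESLESHIGH"]
)
cat("STATUSMS == LESLOW + LESHIGH:", status_is_redundant, "\n")

# Dropping STATUS removes the redundancy.
mm_reduced <- model.matrix(~ CELLTYPE + COMPCHLES, data = sampleAnnot)
cat("reduced design: columns =", ncol(mm_reduced),
    " rank =", qr(mm_reduced)$rank, "\n")
```

When the rank is smaller than the number of columns, at least one coefficient cannot be estimated separately from the others, which is the situation the error describes.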
BTW, this thread will definitely help you if you are new to the design formula: How is the design in DESeq2 work?
Esp., check this reply from Michael Love (Developer of DESeq2) C: How is the design in DESeq2 work?
Thanks for your answers and explanations, I'll check this quickly!
Well, maybe I have an interpretation question now.
In this case, using this formula:
and checking the results with:
does that mean testing for DE genes between "LESHIGH" and "LESLOW", regardless of cell type?
yes, that's the right interpretation
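The formula and the results call did not survive in the posts above, so the following is only a sketch of what such a setup could look like, assuming the design ~ CELLTYPE + COMPCHLES and a contrast of LESHIGH against LESLOW. The helper name run_lesion_contrast is made up for illustration; exprDat and sampleAnnot are the objects named in the question. The base-R part just shows which coefficients such a contrast compares.

```r
# Hypothetical helper (not from the thread): fit the additive model and
# extract LESHIGH vs LESLOW. Requires DESeq2 only when it is called.
run_lesion_contrast <- function(exprDat, sampleAnnot) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = exprDat,
                                        colData   = sampleAnnot,
                                        design    = ~ CELLTYPE + COMPCHLES)
  dds <- DESeq2::DESeq(dds)
  # contrast = c(factor name, numerator level, denominator level)
  DESeq2::results(dds, contrast = c("COMPCHLES", "LESHIGH", "LESLOW"))
}

# Base-R illustration of the same comparison on a toy layout:
# one row per CELLTYPE x COMPCHLES combination.
toy <- expand.grid(
  CELLTYPE  = factor(c("CD4mem", "CD4naiv", "CD8mem", "CD8naiv")),
  COMPCHLES = factor(c("HV", "LESLOW", "LESHIGH"),
                     levels = c("HV", "LESLOW", "LESHIGH"))
)
design_cols <- colnames(model.matrix(~ CELLTYPE + COMPCHLES, data = toy))

# LESHIGH vs LESLOW is the difference of the two lesion coefficients;
# the CELLTYPE coefficients get weight 0.
contrast_vec <- setNames(numeric(length(design_cols)), design_cols)
contrast_vec["COMPCHLESLESHIGH"] <- 1
contrast_vec["COMPCHLESLESLOW"]  <- -1
print(contrast_vec)

# Usage, assuming exprDat and sampleAnnot exist as in the question:
# res <- run_lesion_contrast(exprDat, sampleAnnot)
```

In an additive design like this one, the model estimates a single lesion effect shared by all cell types, with CELLTYPE only adjusting the baseline.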
Alright, thank you.
Can I push a bit more? What should I use for the design / contrast if I want to test DE depending on COMPCHLES, but within each cell type? Should I maybe change my design?
Since you are interested in the effect of the combination of CELLTYPE and COMPCHLES, you may create a new grouping as follows
Then use only newGroup in your design formula. Its levels will contain all the combinations of CELLTYPE & COMPCHLES. Then, by using contrast as before, you can pull out the right contrast to compare.
Thanks a lot! Now it's time to work :)
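For anyone following the newGroup suggestion, here is a rough sketch of how the combined factor and the per-cell-type contrasts could be built. The exact code from the answer is not preserved here, so the underscore separator, the helper names contrast_for and run_per_celltype, and the list celltype_contrasts are all assumptions for illustration.

```r
# CELLTYPE and COMPCHLES columns as in the example sample table.
sampleAnnot <- data.frame(
  CELLTYPE  = rep(c("CD4mem", "CD4naiv", "CD8mem", "CD8naiv"), times = 4),
  COMPCHLES = rep(c("LESLOW", "HV", "LESHIGH", "HV"), each = 4)
)

# One level per CELLTYPE/COMPCHLES combination, e.g. "CD4mem_LESHIGH".
sampleAnnot$newGroup <- factor(
  paste(sampleAnnot$CELLTYPE, sampleAnnot$COMPCHLES, sep = "_")
)
print(levels(sampleAnnot$newGroup))

# Contrast LESHIGH vs LESLOW within a single cell type.
contrast_for <- function(celltype) {
  c("newGroup",
    paste(celltype, "LESHIGH", sep = "_"),
    paste(celltype, "LESLOW",  sep = "_"))
}
celltype_contrasts <- lapply(setNames(nm = unique(sampleAnnot$CELLTYPE)),
                             contrast_for)
print(celltype_contrasts[["CD4mem"]])

# Hypothetical helper: fit ~ newGroup once, then pull one result table
# per cell type. Requires DESeq2 only when it is called.
run_per_celltype <- function(exprDat, sampleAnnot, contrasts_list) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = exprDat,
                                        colData   = sampleAnnot,
                                        design    = ~ newGroup)
  dds <- DESeq2::DESeq(dds)
  lapply(contrasts_list, function(ct) DESeq2::results(dds, contrast = ct))
}

# Usage, assuming exprDat holds the count matrix from the question:
# res_list <- run_per_celltype(exprDat, sampleAnnot, celltype_contrasts)
```

Note that in the example table each MS cell type / lesion group combination appears only once, so this comparison only makes sense if the full dataset has several samples per newGroup level.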