Hi there,
I am trying to design a statistical model for DESeq2 that captures what I'm looking for. I have a homozygous knockout, a heterozygous knockout, a wild type, and a duplication, each in 2 cell types. The RIN values below are made up for the sake of this example. Each group has three replicates. The covariate table looks like this:
Celltype  condition  dosage           RIN
1         WT         2 (normal)       9
1         WT         2                9
1         WT         2                8.2
1         Mutant     1 (het. KO)      7
1         Mutant     1                9
1         Mutant     1                8.2
1         Mutant     0 (homo. KO)     7
1         Mutant     0                10
1         Mutant     0                9.2
1         Mutant     4 (duplication)  9.2
1         Mutant     4 (duplication)  8.3
1         Mutant     4 (duplication)  9
2         WT         2 (normal)       9
2         WT         2                9
2         WT         2                8.2
2         Mutant     1 (het. KO)      7
2         Mutant     1                9
2         Mutant     1                8.2
2         Mutant     0 (homo. KO)     7
2         Mutant     0                10
2         Mutant     0                9.2
2         Mutant     4 (duplication)  9.2
2         Mutant     4 (duplication)  8.3
2         Mutant     4 (duplication)  9
I want to find genes that are affected by this mutation and see how the different dosages affect them.
This is the model I have so far, but DESeq2 reports that the model matrix is not full rank, possibly because of a nesting issue that I can't figure out.
model = ~ condition * Celltype * dosage + RIN
Thanks a lot!
Please explain what condition, cell type, and dosage are in these data (that is, please give column names for that table). Is RIN the RNA Integrity Number? If so, why do you think it should be in the model?
Apologies. Celltype is either neurons or glia. Dosage is the number of copies of the allele: 0 (homozygous deletion), 1 (heterozygous deletion), 2 (wildtype), or 3 (duplication). Yes, RIN is the RNA integrity number. I've seen several papers include RIN in their models, so I was just following suit. The assumption is that measured expression also depends on RNA degradation, which a RIN or RIN^2 term would account for.
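One possible source of the "not full rank" message can be read straight off the table: condition is completely determined by dosage (a sample is WT exactly when dosage is 2), so a model containing both terms and their interaction has redundant columns. The sketch below is plain Python, not DESeq2, and only illustrates the linear algebra. It assumes the dosage values from the table (0, 1, 2, 4), leaves out RIN and the Celltype interactions for brevity, and the helper names (`matrix_rank`, `is_mutant`) are made up for this example. The second design, `~ Celltype + dosage` with dosage as a factor, is a hypothetical alternative shown only for comparison.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column adds no new information
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def is_mutant(dose):
    """condition as a 0/1 indicator: WT (0) exactly when dosage is 2."""
    return 0 if dose == 2 else 1


# One row per (Celltype, dosage) group; replicates do not change the rank.
groups = list(product((1, 2), (2, 1, 0, 4)))

# ~ condition * dosage, dosage numeric:
# columns = intercept, conditionMutant, dosage, conditionMutant:dosage
full = [[1, is_mutant(d), d, is_mutant(d) * d] for _, d in groups]

# ~ Celltype + dosage, dosage as a factor with reference level 2:
# columns = intercept, Celltype2, dosage0, dosage1, dosage4
reduced = [[1, int(c == 2), int(d == 0), int(d == 1), int(d == 4)]
           for c, d in groups]

full_rank = matrix_rank(full)
reduced_rank = matrix_rank(reduced)

print(f"condition * dosage: rank {full_rank} of {len(full[0])} columns")
print(f"Celltype + dosage:  rank {reduced_rank} of {len(reduced[0])} columns")
```

In the first design the rank comes out one short of the column count, because dosage minus the interaction column equals 2 for every WT row and 0 for every mutant row, which is already expressible from the intercept and the condition column. Treating dosage as a factor instead gives the same kind of redundancy, since the mutant indicator is then the sum of the non-reference dosage indicators.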