Hi,
This is my first time using the DESeq2 package for RNA-seq analysis. I am analyzing a longitudinal study of three different time points in mice, with three different genotypes of animals - 1 wildtype, 2 mutated. I want to evaluate differential expression as an effect of the different genotypes across each age. For example, I want to evaluate differential expression in the 1 year old mice from each genotype and so on. I am confused on how I should compare expression levels across the genotypes at specific time points, using the code below
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampletable,
directory = directory,
design = ~ Age + Genotype
Any help with how to properly use "design" would be greatly appreciated!
Thank you for the help. However, when I try to include the interaction term, I get an error saying that the model matrix is not full rank. Do you know if this is an issue with the way the experiment was designed? I am attaching an image of my samples. samples
Not full rank means that you don't have every combo of all your elements.. You can't model how genotype and age interact when you are missing some age:genotype combinations. You should have done some p8 d52s.
Well yes, the issue is that there is no d52 Genotype of the p8 Age, so DESeq2 can not calculate the interaction between them.
Thank you all for the help! Do you have any suggestions on how to analyze the data given that I don't have d52 p8? Should I do the analysis separately for the p8 and then analyze the 6m and 12m together?
The DESea2 vignette explains what to do in case of "not full rank matrix". In your case you have an interaction level without samples, which is solvable by manually editing the model matrix, and then providing this to DESeq. See: http://www.bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#levels-without-samples
Thank you very much for the help! I've manually edited my matrix and then provided it to DESeq. I reordered my genotypes using
ddsHTSeq_filtered$Genotype <- factor(ddsHTSeq_filtered$Genotype, levels = c("WT", "mdx", "d52"))
Now my question remains in extracting the relevant contrasts, as you mentioned. I run the
resultsNames()
function on my DESeq object and receive the following contrasts"Intercept" "Age6m" "Agep8" "Genotypemdx" "Genotyped52" "Age6m.Genotypemdx" "Agep8.Genotypemdx" "Age6m.Genotyped52"
I'm not sure which contrasts to use if, for example, I wanted to compare the WT and mdx gene expression at age p8. If you could help with this, it would be greatly appreciated!