Hello community,
I'm relatively new to differential gene expression (DGE) analysis of RNA-Seq data, and I've seen that DESeq2 is one of the go-to tools for it. I am a bit confused about the list of genes I am obtaining and about which transformation is best to use (variance stabilizing vs. rlog).
My experimental design consists of two cell lines. Each cell line has a control group and a group that received the same treatment, both done in triplicate.
Experimental Design
sample condition
NALM6 control (x3)
NALM6 treatment (x3)
SEM control (x3)
SEM treatment (x3)
Code for Experimental Design
expDesign <- data.frame(
  row.names = colnames(geneCounts),
  sample    = c(rep("NALM6", 6), rep("SEM", 6)),
  condition = c(rep("control", 3), rep("treatment", 3), rep("control", 3), rep("treatment", 3))
)
Code for Running DESeq2
# Constructing the DESeq2 object
dds <- DESeqDataSetFromMatrix(countData = geneCountsMat,
                              colData   = expDesign,
                              design    = ~ condition)
"Running DESeq"
#Use dds object previously created
dds <- DESeq(dds)
Results Example
DataFrame with 10 rows and 2 columns
log2FoldChange padj
<numeric> <numeric>
ENSG00000196230 -2.31206 0.00000e+00
ENSG00000112972 -1.96868 0.00000e+00
ENSG00000182831 1.94195 0.00000e+00
ENSG00000116830 -1.35854 4.06128e-141
ENSG00000111602 -2.61308 5.67284e-136
- Does DESeq2 differentiate between cell lines or should I run DESeq2 separately per cell line (control vs treatment)?
- How can I know which genes are most differentially expressed per cell line according to treatment?
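As a rough illustration of the first question, here is a small base R sketch that compares the coefficients implied by `~ condition` with those of a design that also includes the cell line. The `expDesign` below is rebuilt by hand (without the `geneCounts` column names), and the interaction formula is only one possible choice, not something taken from the original code.

```r
# Rebuild the 12-sample design by hand (assumption: same layout as in the post).
expDesign <- data.frame(
  sample    = factor(rep(c("NALM6", "SEM"), each = 6)),
  condition = factor(rep(rep(c("control", "treatment"), each = 3), times = 2))
)

# Cross-tabulate to confirm three replicates per cell line and condition.
designTable <- table(expDesign$sample, expDesign$condition)
print(designTable)

# Coefficients fitted with the design used in the post: no cell-line term at all.
simpleCols <- colnames(model.matrix(~ condition, data = expDesign))
print(simpleCols)

# Coefficients for a design with cell line, condition, and their interaction.
fullCols <- colnames(model.matrix(~ sample + condition + sample:condition,
                                  data = expDesign))
print(fullCols)
```

With `~ condition` alone, the model has no term for the cell line, so the treatment effect is estimated across both lines pooled together.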
Basti, would something like this work? This is after running DESeq2, of course:
You want to change your results() call to something like this:
Trivas, would the condition still apply for control vs treatment here?
I would recommend reading the help documentation for results() using ?results. I think the easiest approach to wrap your head around is using resultsNames() coupled with relevel().
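A minimal sketch of how relevel() and resultsNames() might fit together for this design is below. It is not the snippet from the earlier reply. It assumes simulated counts in place of the real `geneCounts`, an interaction design (`~ sample + condition + sample:condition`), and hypothetical object names `resNALM6` and `resSEM`.

```r
library(DESeq2)

# Simulated counts stand in for the real count matrix (assumption, no true effects).
set.seed(1)
geneCountsMat <- matrix(rnbinom(1000 * 12, mu = 100, size = 10), ncol = 12)
rownames(geneCountsMat) <- paste0("gene", seq_len(1000))
colnames(geneCountsMat) <- paste0("s", 1:12)

expDesign <- data.frame(
  row.names = colnames(geneCountsMat),
  sample    = factor(rep(c("NALM6", "SEM"), each = 6)),
  condition = factor(rep(rep(c("control", "treatment"), each = 3), times = 2))
)

# Set the reference levels explicitly so the coefficient names are predictable.
expDesign$sample    <- relevel(expDesign$sample, ref = "NALM6")
expDesign$condition <- relevel(expDesign$condition, ref = "control")

dds <- DESeqDataSetFromMatrix(countData = geneCountsMat,
                              colData   = expDesign,
                              design    = ~ sample + condition + sample:condition)
dds <- DESeq(dds)

# List the coefficients that results() can be asked for.
print(resultsNames(dds))

# Treatment vs control within NALM6 (the reference cell line).
resNALM6 <- results(dds, name = "condition_treatment_vs_control")

# Treatment vs control within SEM: main effect plus the interaction term.
resSEM <- results(dds, contrast = list(c("condition_treatment_vs_control",
                                         "sampleSEM.conditiontreatment")))

# Genes ordered by adjusted p-value for each cell line.
head(resNALM6[order(resNALM6$padj), c("log2FoldChange", "padj")])
head(resSEM[order(resSEM$padj), c("log2FoldChange", "padj")])
```

Changing the reference with relevel() before running DESeq() changes which comparisons show up in resultsNames(), so it is worth printing that list before each results() call.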
Have some reading to do for sure, thanks for the help!