DGE/DEG Analysis for comparing multiple cell lines
21 months ago
joe_genome ▴ 50

Hello community,

I'm relatively new to DGE/DEG analysis using RNA-Seq data, and I've seen that DESeq2 is one of the go-to tools for differential gene expression analysis. I am a bit confused about the list of genes I am obtaining and about which type of normalization method is best to use (variance stabilizing vs rlog).

I have an experimental design that consists of two cell lines, each with a control group and a group that received the same treatment (all done in triplicate).

Experimental Design

sample    condition
NALM6     control (x3)
NALM6     treatment (x3)
SEM       control (x3)
SEM       treatment (x3)

Code for Experimental Design

expDesign <- data.frame(
  row.names = colnames(geneCounts),
  sample = c(rep("NALM6", 6), rep("SEM", 6)),
  condition = c(rep("control", 3), rep("treatment", 3), rep("control", 3), rep("treatment", 3))
)

Code for Running DESeq2

library(DESeq2)

# Constructing the DESeq2 object
dds <- DESeqDataSetFromMatrix(countData = geneCountsMat,
                              colData = expDesign,
                              design = ~ condition)

# Running DESeq on the dds object created above
dds <- DESeq(dds)

Results Example

DataFrame with 10 rows and 2 columns
                log2FoldChange         padj
                     <numeric>    <numeric>
ENSG00000196230       -2.31206  0.00000e+00
ENSG00000112972       -1.96868  0.00000e+00
ENSG00000182831        1.94195  0.00000e+00
ENSG00000116830       -1.35854 4.06128e-141
ENSG00000111602       -2.61308 5.67284e-136

  1. Does DESeq2 differentiate between cell lines or should I run DESeq2 separately per cell line (control vs treatment)?
  2. How can I know which genes are most differentially expressed per cell line according to treatment?
RNA-Seq GeneExpression DESEQ2 • 1.6k views
21 months ago
Basti ★ 2.0k

"Does DESeq2 differentiate between cell lines": your design formula is ~ condition, which does not include the cell line, so the answer is no.

"should I run DESeq2 separately per cell line": it depends on your biological question. You can either run two differential analyses, e.g. control vs treatment within each cell line (have a look at the "contrasts" part of the DESeq2 vignette), or run one differential analysis between all control and all treatment samples after adjusting for cell line in your model.

"How can I know which genes are most differentially expressed per cell line according to treatment?": this is almost the same answer. I think the best approach is to use contrasts.

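One way to set up the per-cell-line contrasts mentioned above is to combine cell line and condition into a single grouping factor, as the DESeq2 vignette suggests for designs like this. The following is only a sketch: the counts are simulated with makeExampleDESeqDataSet() as a stand-in for the real geneCountsMat, and the `group` column is an assumption that does not appear in the original post.

```r
library(DESeq2)

# Simulated counts stand in for the real matrix (500 genes, 12 samples).
set.seed(1)
sim <- makeExampleDESeqDataSet(n = 500, m = 12)
geneCountsMat <- counts(sim)

# Same layout as the question: 2 cell lines x 2 conditions x 3 replicates.
expDesign <- data.frame(
  row.names = colnames(geneCountsMat),
  sample    = factor(rep(c("NALM6", "SEM"), each = 6)),
  condition = factor(rep(rep(c("control", "treatment"), each = 3), times = 2))
)
# Hypothetical helper column: one level per combination, e.g. "NALM6_control".
expDesign$group <- factor(paste(expDesign$sample, expDesign$condition, sep = "_"))

dds <- DESeqDataSetFromMatrix(countData = geneCountsMat,
                              colData   = expDesign,
                              design    = ~ group)
dds <- DESeq(dds)

# Treatment vs control within each cell line (numerator level, then denominator).
res_nalm6 <- results(dds, contrast = c("group", "NALM6_treatment", "NALM6_control"))
res_sem   <- results(dds, contrast = c("group", "SEM_treatment", "SEM_control"))

# Rank by adjusted p-value to see the strongest hits for one cell line.
head(res_nalm6[order(res_nalm6$padj), c("log2FoldChange", "padj")])
```

If the question is instead the overall treatment effect across both cell lines, a design of ~ sample + condition adjusts for cell line and a single results() call on condition gives the pooled comparison.
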
  1. Thought this was included in my design, where NALM6 and SEM each correspond to separate cell lines. I'll check the contrasts part of DESeq2, was unaware of this.
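
A column in colData only affects the model if it is named in the design formula. A small base-R sketch of that point, assuming the expDesign layout from the question (the sample names s1 to s12 are made up for illustration), compares the model matrix columns for the two formulas:

```r
# Rebuild the design table from the question; row names are hypothetical.
expDesign <- data.frame(
  row.names = paste0("s", 1:12),
  sample    = rep(c("NALM6", "SEM"), each = 6),
  condition = rep(rep(c("control", "treatment"), each = 3), times = 2)
)

# Design used in the question: only a treatment coefficient, no cell line term.
mm_condition <- model.matrix(~ condition, data = expDesign)
print(colnames(mm_condition))

# Naming the cell line column (`sample`) in the formula adds a coefficient
# for SEM relative to NALM6.
mm_additive <- model.matrix(~ sample + condition, data = expDesign)
print(colnames(mm_additive))
```

With ~ condition alone, all six control samples are treated as one group regardless of cell line, so cell-line differences end up in the residual variation.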

Basti, would something like this work? This is after running DESeq2, of course:

dds <- DESeqDataSetFromMatrix(countData, colData = sampleTable, design = ~ cellLine + condition + cellLine:condition)

# Perform differential expression analysis
dds <- DESeq(dds)
res <- results(dds)

# Extract DEGs for N6 cell line
n6_res <- res[res$cellLine == "N6", ]
n6_degs <- n6_res[abs(n6_res$log2FoldChange) > 1 & n6_res$padj < 0.05, ]

# Extract DEGs for SE cell line
se_res <- res[res$cellLine == "SE", ]
se_degs <- se_res[abs(se_res$log2FoldChange) > 1 & se_res$padj < 0.05, ]

You want to change your results call to something like this:

res <- results(dds, contrast = c("condition", "cell_line_1", "cell_line_2"))

Trivas, would the condition still apply for control vs treatment here?


I would recommend reading the help documentation for results() using ?results.

contrast    
this argument specifies what comparison to extract from the object to build a results table. one of either:

a character vector with exactly three elements: the name of a factor in the design formula, the name of the numerator level for the fold change, and the name of the denominator level for the fold change (simplest case)

a list of 2 character vectors: the names of the fold changes for the numerator, and the names of the fold changes for the denominator. these names should be elements of resultsNames(object). if the list is length 1, a second element is added which is the empty character vector, character(). (more general case, can be to combine interaction terms and main effects)

a numeric contrast vector with one element for each element in resultsNames(object) (most general case)

I think the easiest approach to wrap your head around is using resultsNames() coupled with relevel().
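
As a rough illustration of how resultsNames() and relevel() fit together with the interaction design proposed earlier in the thread, here is a sketch that follows the pattern shown in the ?results examples. The data are simulated with makeExampleDESeqDataSet(), and the factor names cellLine and condition are assumptions chosen to match the thread:

```r
library(DESeq2)

# Simulated data set (500 genes, 12 samples) standing in for the real one.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds$cellLine  <- factor(rep(c("NALM6", "SEM"), each = 6))
dds$condition <- factor(rep(rep(c("control", "treatment"), each = 3), times = 2))

# relevel() makes the reference (denominator) levels explicit.
dds$cellLine  <- relevel(dds$cellLine, ref = "NALM6")
dds$condition <- relevel(dds$condition, ref = "control")

design(dds) <- ~ cellLine + condition + cellLine:condition
dds <- DESeq(dds)

# Coefficient names available for name= or the list form of contrast=.
coef_names <- resultsNames(dds)
print(coef_names)

# Treatment effect in the reference cell line (NALM6): the main effect.
res_nalm6 <- results(dds, contrast = c("condition", "treatment", "control"))

# Treatment effect in SEM: main effect plus interaction term (list form).
res_sem <- results(dds, contrast = list(c("condition_treatment_vs_control",
                                          "cellLineSEM.conditiontreatment")))

# Whether the treatment effect differs between cell lines: the interaction term.
res_diff <- results(dds, name = "cellLineSEM.conditiontreatment")
```

Note that results() returns one row per gene, so it cannot be subset by cell line afterwards; the cell line is selected through the contrast itself.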


Have some reading to do for sure, thanks for the help!
