Question

Analysis with no Differentially Expressed Genes?

1

Entering edit mode

6.5 years ago

vitor.saldanha11 ▴ 10

Hi Guys, I'm running my analysis for differential expression and the output give me no differentially expressed genes. Is there something wrong with my code or that's the true output of my samples?

First I imported the phenotype data:

phenoData <- read.AnnotatedDataFrame("design.txt", row.names = NULL, sep = "\t")

An object of class 'AnnotatedDataFrame' rowNames: 1 2 ... 6 (6 total) varLabels: Samples Treatment varMetadata: labelDescription

Then I ran the RMA:

eset <- justRMA("CN1_L10_T1.CEL","CN2_L10_T1.CEL","CN3_L10_T1.CEL",
            "Ti50_1_L10_T1.CEL", "Ti50_2_L10_T1.CEL", "Ti50_3_L10_T1.CEL",
            phenoData=phenoData)

I constructed the model matrix:

sample <- factor(rep(c("Cont", "Ti50"), each=3))
design <- model.matrix(~0 + sample)
colnames(design) <- levels(sample)
design

Cont Ti50
1    1    0
2    1    0
3    1    0
4    0    1
5    0    1
6    0    1

attr(,"assign") [1] 1 1 attr(,"contrasts") attr(,"contrasts")$sample [1] "contr.treatment"

I constructed the contrast matrix:

contrasts <- makeContrasts(Diff = Cont - Ti50, levels = design)
contrasts

Contrasts Levels Diff Cont 1 Ti50 -1

Then I ran the analysis:

fit <- lmFit(eset, design)  
fit2 <- contrasts.fit(fit, contrasts)
efit <- eBayes(fit2)        
deg <- topTable(efit, coef="Diff", p.value = 0.05, lfc = log2(1.5), adjust.method = "fdr", number = nrow(eset))   
deg

data frame with 0 columns and 0 rows

I noticed that if I put p-value=1 I have 500 degs. So I investigated:

range(deg$adj.P.Val)

0.7866448 0.8526774

Indeed the threshold cuts all my genes. That's the true output ( no genes differentially expressed)? Or I did something wrong?

microarray RNA-Seq differential expression • 3.4k views

ADD COMMENT • link updated 6.2 years ago by Biostar 20 • written 6.5 years ago by vitor.saldanha11 ▴ 10

1

Entering edit mode

Did you inspect the MDS plot (plotMDS() function from limma)? Could you post the image here?

ADD REPLY • link 6.5 years ago by h.mon 35k

1

Entering edit mode

Though you have used established and well used topTable function, Limma developers discourage such practice. Copy/pasted from their manual:

Although topTable enables users to set p-value and lfc cutoffs simultaneously, this is not generally recommended. If the fold changes and p-values are not highly correlated, then the use of a fold change cutoff can increase the false discovery rate above the nominal level. Users wanting to use fold change thresholding are usually recommended to use treat and topTreat instead.

topTable does double filtering (by lfc and p-value(. There are/were considerable discussions among bioinformatics communities regarding this. To address this issue, limma recommends treat and topTreat method in case of filtering by fold change.

ADD REPLY • link 6.5 years ago by cpad0112 21k

0

Entering edit mode

I second h.mon's comment: what does your data look like in the global analyses such as PCA or MDS? If you do not see a clear grouping that follows the design of your experiment, most likely you're not going to get any DE genes. The main task should then be to find out whether that's due to noise or to possibly confounding factors.

ADD REPLY • link 6.5 years ago by Friederike 9.0k

score 3 · Answer 1 · 2018-05-02

It is definitely possible to have genes identified with a 3 x 3 comparisons, but I'm not sure of how clear the Ti50 effect is.

I would usually expect limma to be a little more sensitive than the built-in R functions (like aov() for ANOVA and lm() for linear-regression), although some of those pre-processing functions aren't exactly the same as I as expecting.

Perhaps you could try testing some functions this this demo dataset / tutorial?

https://www.bioconductor.org/help/course-materials/2005/BioC2005/labs/lab01/estrogen/

I don't believe I've followed this exact tutorial, but I would expect this dataset to have some genes identified with a low FDR (and usually an example dataset with clear expression changes would be a good idea for initially testing code).