Entering edit mode
5.6 years ago
Liftedkris
▴
30
Hi fellows,
I performed RNAseq analysis of human neutrophils infected by Aspergillus fumigatus. Everything seemed to work fine but during the differential expression analysis using EdgeR, I used the standard p.value cut off of 0.05 but surprisingly no genes were found for the conditions at that p. Value .
My question is, what could be the problem?
I have 24 samples, in three groups, each group consisting of 8 samples 4 each for control and test i did alignment with Hisat2 and sorted with samtools and used HTseq to generate read counts. at this point everything seemed to work fine.
getwd()
data <- setwd("C:/Users/HP Gamer/Desktop/RNA_seq_read_counts")
group <- factor(c(rep("host.30hpi_control", 4), rep("host.30hpi_infected", 4),
rep("host.90hpi_control", 4), rep("host.90hpi_infected", 4), rep("host.180hpi_control", 4), rep("host.180hpi_infected", 4) ))
library(edgeR)
counts.host <- readDGE(list.files(pattern = ".count"), data, columns = c(1,2))
counts.host$counts <- counts.host$counts[1:(nrow(counts.host$counts)-5), ]
counts.host$counts <- counts.host$counts[rowSums(counts.host$counts > 3) > 2, ]
head(counts.host$counts, 20)
dim(counts.host$counts)
counts.host <- calcNormFactors(counts.host)
design <- model.matrix(~0 + group)
rownames(design) <- colnames(counts.host$counts)
contrasts <- makeContrasts("Host_30hpi" = grouphost.30hpi_infected - grouphost.30hpi_control, "Host_90hpi" = grouphost.90hpi_infected - grouphost.90hpi_control, "Host_180hpi" = grouphost.180hpi_infected - grouphost.180hpi_control, levels = design)
library(limma)
png("host_voom.png")
y <- voomWithQualityWeights(counts = counts.host, design = design, plot = TRUE)
dev.off()
plot.colors <- c(rep("blue", 4), rep("red", 4), rep("orange", 4), rep("black", 4))
png("host_MDS.png")
plotMDS(counts.host, main = "MDS Plot for Count Data", labels = colnames(counts.host$counts), col = plot.colors, cex = 0.9, xlim=c(-2,5))
dev.off()
counts.host.mod <- t(cpm(counts.host))
dist <- dist(counts.host.mod)
png("host_HC.png")
plot(hclust(dist), main="Hierarchical Clustering Dendrogram")
dev.off()
fit <- lmFit(y, design)
fit <- contrasts.fit(fit, contrasts)
fit <- eBayes(fit)
top_30hpi <- topTable(fit, coef = "Host_30hpi", adjust = "fdr", number = "Inf", p.value = 0.05, sort.by = "P")
top_90hpi <- topTable(fit, coef = "Host_90hpi", adjust = "fdr", number = "Inf", p.value = 0.05, sort.by = "P")
top_180hpi <- topTable(fit, coef = "Host_180hpi", adjust = "fdr", number = "Inf", p.value = 0.05, sort.by = "P")
write.table(top_30hpi, file = "Host_DEG_annotated30.csv", sep = ",", col.names = NA)
write.table(top_90hpi, file = "Host_DEG_annotated90.csv", sep = ",", col.names = NA)
write.table(top_180hpi, file = "Host_DEG_annotated180.csv", sep = ",", col.names = NA)
Do you realize that you are not using
edgeR
at all for differential analysis butlimma
? I am not adept withlimma
so I cannot contribute too much but why don't you simply follow exactly theedgeR
manual and then see what comes out?Just a couple of comments:
lmFit
is a function from limma, which is presumably why you chose to use the voom-normalized values for the DE analysiscounts.host$counts <- counts.host$counts[1:(nrow(counts.host$counts)-5), ]
, i.e. why are you removing the last five rows?group
anddesign
objects look like, do you mind sharing them?