Hi, I am working on RNA-Seq data for Arabidopsis plant. I have samples from 4 genotypes (A, B, C, D), Untreated and after treatment, with 3 replicates for each. I used STAR for alignment and obtained count.txt file by using featureCounts. Now I am doing the DGE analysis by DESeq2. My coldata looks like this:
genotype treatment
A.1_Control A Mock
A.2_Control A Mock
A.3_Control A Mock
A.1_PostDL A Post-HL
A.2_PostDL A Post-HL
A.3_PostDL A Post-HL
B.1_Control B1 Mock
B.2_Control B1 Mock
B.3_Control B1 Mock
B.1_PostDL B1 Post-HL
B.2_PostDL B1 Post-HL
B.3_PostDL B1 Post-HL
C.1_Control C Mock
C.2_Control C Mock
C.3_Control C Mock
C.1_PostDL C Post-HL
C.2_PostDL C Post-HL
C.3_PostDL C Post-HL
D.1_Control D Mock
D.2_Control D Mock
D.3_Control D Mock
D.1_PostDL D Post-HL
D.2_PostDL D Post-HL
D.3_PostDL D Post-HL
I am using this code for analysis:
countdata <- read.table("NewTotalCounts.txt", header=TRUE, row.names=1)
countdata <- countdata[ ,6:ncol(countdata)]
colnames(countdata) <- gsub("\\.[sb]am$", "", colnames(countdata))
countdata <- as.matrix(countdata)
library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~genotype*treatment)
dds
dds <- DESeq(dds)
matrix(resultsNames(dds))
"1" "Intercept"
"2" "genotype_C_vs_A"
"3" "genotype_D_vs_A"
"4" "genotype_B_vs_A"
"5" "treatment_Post.HL_vs_Mock"
"6" "genotypeC.treatmentPost.HL"
"7" "genotypeD.treatmentPost.HL"
"8" "genotypeB.treatmentPost.HL"
I extracted the DE genes for each contrast as:
> res_C <- results(dds, contrast="genotype_C_vs_A", alpha=0.05, lfcThreshold=0)
> summary(res_C)
out of 27683 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up) : 115, 0.42%
LFC < 0 (down) : 107, 0.39%
outliers [1] : 2, 0.0072%
low counts [2] : 6909, 25%
(mean count < 4)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
I am generating the VennDiagram as:
venn(list(B_vs_A=rownames(res_B[which(res_B$padj <= 0.05),]), C_vs_A=rownames(res_C[which(res_C$padj <= 0.05),]), D_vs_A=rownames(res_D[which(res_D$padj <= 0.05),])))
My questions are:
I am not able to understand the contrasts it is making like; contrast "genotype_C_vs_A" is under Mock condition or after treatment; what is "treatment_Post.HL_vs_Mock" - it is for which sample, and "genotypeC.treatmentPost.HL" - it is after treatment vs A or vs C(Mock) ?
How can I extract these DE genes?
How can I extract the DE genes which are common in two or all the contrasts?
Any help from you will be very helpful for me. Thank you
I've reformated your post somewhat to make it more readable. This has mostly invovled setting the output of your R commnads to be formatted as code, which can be done by indenting it 4 spaces or using the code button (marked 101010 in the toolbar).
Thank you so much @ i.sudbery.