Hello all,
I am using DESeq2 and trying to compare my 3 controls vs. my 3 knockdowns, and my 3 controls vs. my 3 overexpressions. I'm trying to use the design formula, but it isn't giving me the right comparisons. I have my count matrix like this:
transcript_id sample1 sample2 sample3 sample4 sample5 sample6 sample7 sample8 sample9
ENST00000542671 0 0 0 0 0 0 1 0 0
ENST00000496116 21 25 24 19 23 30 19 11 27
ENST00000496117 3 4 3 3 3 4 2 3 3
ENST00000496114 0 0 0 0 4 0 0 0 0
ENST00000496115 0 0 0 38 60 64 0 0 0
ENST00000496112 754 598 1910 99 339 423 304 468 290
ENST00000496113 0 0 0 0 0 0 0 0 0
and my current samples are set up like this:
name type
sample1 Control1
sample2 KD1
sample3 OE1
sample4 Control2
sample5 KD2
sample6 OE2
sample7 Control3
sample8 KD3
sample9 OE3
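Since the type column carries both the condition and the replicate number, one way to get a single condition label per sample might be to strip the trailing digit. A minimal sketch in base R, assuming the table above is entered by hand (the names `samples`, `condition` and `replicate` are made up for illustration):

```r
# Sample table as shown above, entered by hand for illustration.
samples <- data.frame(
  name = paste0("sample", 1:9),
  type = c("Control1", "KD1", "OE1",
           "Control2", "KD2", "OE2",
           "Control3", "KD3", "OE3"),
  stringsAsFactors = FALSE
)

# Split e.g. "KD2" into condition "KD" and replicate "2".
samples$condition <- sub("[0-9]+$", "", samples$type)
samples$replicate <- sub("^[A-Za-z]+", "", samples$type)

# Put Control first so that it becomes the reference level.
samples$condition <- factor(samples$condition,
                            levels = c("Control", "KD", "OE"))

# Tabulate how many samples fall into each condition.
print(table(samples$condition))
```

With the table laid out this way, the conditions alternate Control, KD, OE across sample1 to sample9 rather than coming in blocks of three.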
I tried using this code to have the samples included in the results output:
countMatrix <- as.matrix(read.csv("transcript_count_matrix.csv", header = T, row.names = 1))
coldata <- data.frame(row.names = colnames(countMatrix),group = rep(c("sample1","sample4","sample7","sample2","sample5","sample8","sample3","sample6","sample9"),1,each = 1),
treatment = rep(c("control","KD","OE"), each = 3))
coldata$treatment = factor(x = coldata$treatment,levels = c('control','KD','OE'))
dds <- DESeqDataSetFromMatrix(countData = countMatrix, colData = coldata, design = ~ group )
#___________
dds$group<- factor(paste0(dds$group,dds$treatment))
design(dds) <- ~ group
dds <- DESeq(dds)
resultsNames(dds)
#--------------
dds <- DESeq(dds)
resultsNames(dds)
I want output like this, but with each condition/sample type included, so that each value is matched to its sample number. I would then like to input this into GSEA to show which genes are enriched in Control (3 samples) vs. Knockdown (3 samples).
geneID baseMean log2FoldChange lfcSE stat pval padj
ENST00000462898 801.163658 -13.0500301 3.351082927 -3.894272503 9.85E-05 0.091581957
ENST00000397492 1374.439042 8.500577708 2.211868188 3.843166493 0.000121457 0.091581957
ENST00000482918 655.4428802 -12.72048174 3.3108725 -3.842033103 0.000122019 0.091581957
MSTRG.2890.8 691.0693534 -12.46673598 3.322664026 -3.752030264 0.000175408 0.091581957
ENST00000543146 877.2128689 -12.16322652 3.207940824 -3.791599406 0.00014968 0.091581957
ENST00000376444 648.0586781 12.5303438 3.262528563 3.840684782 0.000122692 0.091581957
ENST00000322157 1711.77583 12.37487207 2.964221221 4.174746467 2.98E-05 0.091581957
ENST00000493568 1021.336235 -13.45675527 3.592267718 -3.746033515 0.000179653 0.091581957
ENST00000545822 3916.906123 -8.135649057 2.127479894 -3.824078 0.000131262 0.091581957
MSTRG.26125.15 420.6950308 -11.67749516 3.135518818 -3.724262502 0.000195887 0.091581957
This is what I currently get for dds, which only compares sample1 with everything else:
resultsNames(dds)
[1] "Intercept" "group_sample2_vs_sample1" "group_sample3_vs_sample1" "group_sample4_vs_sample1"
[5] "group_sample5_vs_sample1" "group_sample6_vs_sample1" "group_sample7_vs_sample1" "group_sample8_vs_sample1"
[9] "group_sample9_vs_sample1"
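What I think the setup might need to look like instead is sketched below, with the condition (not the sample name) as the design variable and the two comparisons pulled out with the `contrast` argument of `results()`. This is only a sketch under assumptions: the CSV columns are in the order sample1 to sample9, the conditions follow the sample table above, and `res_kd`, `res_oe`, `norm_counts`, `kd_samples`, `kd_table` and the output file name are names I made up.

```r
library(DESeq2)

countMatrix <- as.matrix(read.csv("transcript_count_matrix.csv",
                                  header = TRUE, row.names = 1))

# Assumption: columns are sample1..sample9 in order, so the conditions
# alternate control, KD, OE as in the sample table.
coldata <- data.frame(
  row.names = colnames(countMatrix),
  treatment = factor(rep(c("control", "KD", "OE"), times = 3),
                     levels = c("control", "KD", "OE"))
)

dds <- DESeqDataSetFromMatrix(countData = countMatrix,
                              colData = coldata,
                              design = ~ treatment)
dds <- DESeq(dds)

# contrast = c(factor, numerator, denominator):
# log2FoldChange is KD (or OE) relative to control.
res_kd <- results(dds, contrast = c("treatment", "KD", "control"))
res_oe <- results(dds, contrast = c("treatment", "OE", "control"))

# Normalized counts per sample, placed next to the statistics so each
# value can be matched to its sample.
norm_counts <- counts(dds, normalized = TRUE)
kd_samples <- rownames(coldata)[coldata$treatment %in% c("control", "KD")]
kd_table <- cbind(as.data.frame(res_kd), norm_counts[, kd_samples])
write.csv(kd_table, "control_vs_KD.csv")
```

If this is on the right track, `resultsNames(dds)` should list condition-level comparisons against control instead of one coefficient per sample.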
Would anyone know how I can accomplish this? My goal is to use this as input to GSEA, and so far I haven't been able to get what I wanted, only a preranked list. Any help would be appreciated!
Thanks, Bryce