The design matrix i have is this I would like to know how to use they way its done in deseq2 where we can use the contrast function to make particular comparison
The code im running to make differential testing is this
tcgaGlm = lmFit(as.matrix(tcgaExpr), design = tcgaDesign)
tcgaGlm = eBayes(tcgaGlm)
my design matrix looks like this
head(tcgaDesign)
Offset CEBPA DNMT3A FAM5C FLT3_ITD FLT3_TKD IDH1 IDH2 KIT KRAS NPM1 NRAS PHF6 PTPN11 RAD21 RUNX1 SMC1A SMC3 STAG2 TET2 TP53 U2AF1 WT1
TCGA-AB-2803 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
TCGA-AB-2805 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
TCGA-AB-2806 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
TCGA-AB-2807 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
TCGA-AB-2808 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
TCGA-AB-2810 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
complex minus5_5q minus7q plus8_8q plus21 t_15_17 t_8_21 inv16_t16_16
TCGA-AB-2803 0.0000000 0.00000000 0.0000000 0.0000000 0.00000000 1.00000000 0.00000000 0.00000000
TCGA-AB-2805 0.0000000 0.00000000 0.0000000 0.0000000 0.00000000 0.00000000 0.00000000 0.00000000
TCGA-AB-2806 0.0000000 0.00000000 0.0000000 0.0000000 0.00000000 0.00000000 1.00000000 0.00000000
TCGA-AB-2807 0.1410256 0.08974359 0.1217949 0.1153846 1.00000000 0.09615385 0.04487179 0.04487179
TCGA-AB-2808 0.0000000 0.00000000 0.0000000 0.0000000 0.00000000 0.00000000 0.00000000 0.00000000
TCGA-AB-2810 0.1410256 0.08974359 0.1217949 0.1153846 0.05063291 0.09615385 0.04487179 0.04487179
Here if I have to make a comparison between CEBPA vs DNMT3A mutation in samples which carries CEBPA mutation and other which carries DNMT3A what or how should I do it.
To begin with I followed this manual I m still not sure once you run the model do i need to form a design matrix which comprises of sample which contain mutation in CEBPA and DNMT3A ? or I can go ahead with this existing experimental design
a <- dput(head(tcgaDesign,10))
structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0.141025641025641, 0, 0.141025641025641, 0,
0, 1, 0, 0, 0, 0, 0.0897435897435897, 0, 0.0897435897435897,
0, 0, 1, 1, 0, 0, 0, 0.121794871794872, 0, 0.121794871794872,
0, 0, 0, 0, 0, 0, 0, 0.115384615384615, 0, 0.115384615384615,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0.0506329113924051, 0, 0, 1, 0, 1,
0, 0, 0.0961538461538462, 0, 0.0961538461538462, 0, 0, 0, 0,
0, 0, 1, 0.0448717948717949, 0, 0.0448717948717949, 0, 0, 0,
0, 0, 0, 0, 0.0448717948717949, 0, 0.0448717948717949, 0, 0,
0, 0), .Dim = c(10L, 31L), .Dimnames = list(c("TCGA-AB-2803",
"TCGA-AB-2805", "TCGA-AB-2806", "TCGA-AB-2807", "TCGA-AB-2808",
"TCGA-AB-2810", "TCGA-AB-2811", "TCGA-AB-2812", "TCGA-AB-2813",
"TCGA-AB-2814"), c("Offset", "CEBPA", "DNMT3A", "FAM5C", "FLT3_ITD",
"FLT3_TKD", "IDH1", "IDH2", "KIT", "KRAS", "NPM1", "NRAS", "PHF6",
"PTPN11", "RAD21", "RUNX1", "SMC1A", "SMC3", "STAG2", "TET2",
"TP53", "U2AF1", "WT1", "complex", "minus5_5q", "minus7q", "plus8_8q",
"plus21", "t_15_17", "t_8_21", "inv16_t16_16")))
To get the total differential gene expressed I ran this from this paper
testResults <- decideTests(tcgaGlm, method="hierarchical",adjust.method="BH", p.value=0.05)[,-1]
significantGenes <- sapply(1:ncol(testResults), function(j){
c <- tcgaGlm$coefficients[testResults[,j]!=0,j+1]
table(cut(c, breaks=c(-5,seq(-1.5,1.5,l=7),5)))
})
colnames(significantGenes) <- colnames(testResults)
unlike deseq2 where If i don't set my reference level the data or group which is first in terms of alphabetical order becomes my reference or I can set up my reference.
- Which is the reference here in this experimental design ? in limma
How to make condition specific or mutation specific comparison for example CEBPA vs DNMT3A mutation in samples which carries CEBPA mutation and other which carries DNMT3A what or how should I do it or need to create new experimental design of only those mutation which I want to compare or from the design i have created it can be done.
I have already posted here ! any help or suggestion would be really appreciated
naive question where do I pass this into my above code workflow I see my new design matrix in contrast
Please check the limma manual, it covers all that extensively.
ttps://www.bioconductor.org/packages/devel/bioc/vignettes/limma/inst/doc/usersguide.pdf