Hi,
I have a RNAseq data to analyse were I have a control and a one treatment for different individuals. I need to block the effects of the individual, but I am having several troubles to get the data that I need. I am using voom because my data is very heterogeneous and voom seams to do a good job normalising my reads.
I am having the following issues:
I want to get the differentially expressed genes (DEGs) of my treatment not of my control. I don't understand after the eBayes analysis why I get the coefficients for both. I have tried a > makeContrasts (TreatvsCont= c2-co, levels = design) to subtract the control effect but then I get 0 DEGs.
I am not sure when to include the 0 (null model) in the model formula, I have seen examples for both types of models.
This are my targets, with my column names of my counts, individual and condition
targets
Individual condition
A1 1 co
A2 3 co
A4 4 co
A5 5 co
E1 1 c2
E2 2 c2
E3 3 c2
E4 4 c2
E5 5 c2 `
This is the code I have been trying:
co2=as.matrix(read.table("2014_04_02_1h_PB.csv",header=T, sep=",", row.names=1))
nf = calcNormFactors (co2)
targets= read.table ("targets.csv", header = T, sep=",",row.names=1)
treat <- factor (targets$condition, levels= c("co", "c2"))
design <- model.matrix(~0+treat)
colnames (design) <- levels (treat)
y <- voom(co2,design,lib.size=colSums(co2)*nf)
corfit <- duplicateCorrelation(y,design,block=targets$Individual)
fit <- lmFit(y,design,block=targets$Individual,correlation=corfit$consensus)
fit2<- eBayes (fit)
results_trt <- topTable (fit2, coef="c2", n=nrow (y), sort.by="none")
From which gives me 18,000 genes with adj.P.Val < 0.01 out of 22,000 genes that I have in total. Which makes no sense..
Thanks in advance for the help.
With just two groups, you can just
design <- model.matrix(~treat)
. Also, you might try getting rid of "individual 2", since that person isn't in both conditions.Also, what sort of normalization factor distribution do you have? I've seen cases where including one sample with vastly fewer reads than the others has screwed up the analysis due to the normalization factors being off.