Entering edit mode
2.6 years ago
Aki
▴
20
hello I'm trying to analyze microarray data (affymetrix.genechip.hg-u133_plus_2). but my data has no genes with adj.P.Val < 0.05 using limma. Could you tell me what are there some wrong code or my sample is bad? my paired sample (5 sample/control (before) and treatment (after)).
head(data)
CTRL_1 CTRL_2 CTRL_3 CTRL_4 CTRL_5 Treat_1 Treat_2 Treat_3 Treat_3 Treat_4 Treat_5
AFFX-BioB-5_at 3649.6 1649.3 1838.3 2103.8 2848.2 2861.1 1717.9 1819.6 2327.9 2621.5
AFFX-BioB-M_at 4726.6 2108.9 2550.0 3696.8 4226.2 3643.5 2260.9 2578.2 3408.9 4177.5
AFFX-BioB-3_at 3179.6 1490.6 1545.3 2186.2 2613.6 2598.8 1400.5 1653.3 2389.5 2633.4
pairinfo <- factor(rep(1:5,2))
group_list <- c(rep("CTRL",5), rep("Treat",5))
group_list <- factor(group_list,levels = c("CTRL","IR"),ordered = F)
design <- model.matrix(~ pairinfo+group_list)
design_CTRL
(Intercept) pairinfo2 pairinfo3 pairinfo4 pairinfo5 group_list
1 1 0 0 0 0 0
2 1 1 0 0 0 0
3 1 0 1 0 0 0
4 1 0 0 1 0 0
5 1 0 0 0 1 0
6 1 0 0 0 0 1
7 1 1 0 0 0 1
8 1 0 1 0 0 1
9 1 0 0 1 0 1
10 1 0 0 0 1 1
fit <- lmFit(data,design)
fit <- eBayes(fit)
DEG <- topTable(fit, adjust = 'BH',coef = 6, number = Inf)
summary(DEG_CTRL)
logFC t P.Value adj.P.Val B
Min. :-8722.22 Min. :-12.2441 Min. :0.0000383 Min. :0.3105 Min. :-4.613
1st Qu.: -34.62 1st Qu.: -1.1654 1st Qu.:0.1315141 1st Qu.:0.5260 1st Qu.:-4.608
Median : -2.90 Median : -0.1378 Median :0.3569465 Median :0.7139 Median :-4.595
Mean : 30.27 Mean : -0.1634 Mean :0.4074041 Mean :0.6982 Mean :-4.587
3rd Qu.: 24.41 3rd Qu.: 0.8689 3rd Qu.:0.6618781 3rd Qu.:0.8824 3rd Qu.:-4.572
Max. :15670.46 Max. : 14.0514 Max. :1.0000000 Max. :1.0000 Max. :-4.511
It may just mean that the treatment has no effect on gene expression
also I believe DeSeq2 is better (more powerful) for such small (it is considered quite small) sample sizes
DeSeq is for RNA-seq, this is microarray data. I rather suspect there is something wrong with the experiment design or the formula. But I don't know what the real design is and what the factors are. However, for all factors except group you seemingly have only 2 of 10 samples for value 1. I think you should start with a simpler design, just CTRL vs Treat and see what happens. If the pair_info data denote the same patient, you should rather encode them in a single factor with patient id. Also then a more appropriate model formula would be
~group + group*patient
. the more I think about it, I believe it is your model matrix that is the cause.I think it's because you didn't bother check the output from your own code. Try checking the output/group_list posted in your code:
Please always keep analysis simple, when you are testing your code and your data.