Hello all,
I have normalized counts from an RNA-Seq analysis of 6 populations. We are waiting for the triplicates, but I have discovered that limma can do a differential expression analysis of the genes in one population vs. the others (and the results seem to work pretty well!). Is it statistically relevant? Which statistic does it use (since it cannot do a t-test)?
Please use Google and the search function, e.g. to find this post at BioC on that matter, which has been asked and discussed intensively many times before.
Thanks, but the point discussed there, and in the other discussion I found, is not related to my question. I was asking about the statistics that allow limma to do the analysis of 1 vs. the others, not about the convenience of using replicates. We are obviously working on them, but to date we only have data from one patient.
If anyone knows about the statistical analysis and whether the p-values calculated by limma under these conditions are correct, please let me know.
The first line of the first answer to the linked post says "You can't use voom or limma without replicates." This means that the author of limma, one of the leading genomic biostatisticians in the world, believes that these results are not meaningful.
If you really want to look for meaningful analysis of a situation where you have many samples and want to know if one stands out from the rest, look up Z-score/outlier analysis.
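As a rough illustration of the Z-score idea, here is a minimal sketch for a single gene. It assumes log-scale expression values, one per sample, and scores the sample of interest against the mean and standard deviation of the remaining samples. The function name `leave_one_out_z` and the example values are hypothetical, not taken from the original data.

```python
from statistics import mean, stdev


def leave_one_out_z(values, index):
    """Z-score of values[index] relative to all the other samples.

    The sample being scored is excluded from the mean and standard
    deviation so that it cannot mask its own deviation.
    """
    others = values[:index] + values[index + 1:]
    if len(others) < 2:
        raise ValueError("need at least two other samples to estimate spread")
    return (values[index] - mean(others)) / stdev(others)


# Hypothetical log2 expression of one gene in six samples;
# the last sample is the one being compared with the rest.
expression = [5.0, 6.0, 5.5, 6.5, 6.0, 9.0]
z = leave_one_out_z(expression, 5)
print(f"z = {z:.2f}")
```

With only five reference samples the standard deviation is estimated very imprecisely, so a score like this is better read as a ranking aid than as a formal test.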
I think you may have misinterpreted limma's abilities slightly, because the sample size requirements for t-tests (with pooled variance) and for limma are essentially the same. In fact, limma's sample size requirements are exactly the same as for ANOVA.
If I understand your question correctly, you have 6 populations and six RNA samples, and you have conducted a test by pooling 5 of the populations as one group. Whether or not it makes sense to pool 5 populations together is a biological decision, not a statistical decision. If you are happy to treat the 5 populations as replicates, however, then there is no mathematical difficulty with conducting either a t-test or a limma moderated t-test for n=1 sample vs n=5 samples. The test will be conducted relative to the variability observed between the 5 populations that are being treated as one group.
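To make the n=1 vs n=5 case concrete, here is a minimal sketch of the ordinary pooled-variance t statistic for one gene. It is not limma's moderated statistic, which additionally shrinks each gene's variance towards a value estimated across all genes. The function name `one_vs_group_t` and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def one_vs_group_t(single, group):
    """Pooled-variance t statistic for one observation vs. a group.

    The single observation contributes no information about spread,
    so the pooled variance is just the sample variance of the group
    and the degrees of freedom are len(group) - 1.
    """
    n = len(group)
    if n < 2:
        raise ValueError("the group needs at least two samples")
    pooled_var = variance(group)
    # Standard error of (single - group mean): s * sqrt(1/1 + 1/n)
    std_err = sqrt(pooled_var * (1 + 1 / n))
    return (single - mean(group)) / std_err, n - 1


# Hypothetical log2 expression of one gene: one population vs. the other five.
t_stat, df = one_vs_group_t(9.0, [5.0, 6.0, 5.5, 6.5, 6.0])
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

All of the variance information comes from the five pooled populations, which is why the biological question of whether they can be treated as replicates matters so much.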
No, the results are not statistically relevant. Statistics don't work on n=1 experiments; even if you get results, they mean nothing. Wait for more replicates to come, and try to include a batch factor for the different sequencing runs.