RNA-seq: how to handle biological replicates for differential expression analysis
4.4 years ago

I know that tools like DESeq2 or edgeR can be used for RNA-seq differential expression (DE) analysis. My question is how to handle multiple biological replicates. My dataset has 3 biological replicates for healthy samples and 3 biological replicates for diseased samples. I understand that biological replicates should not be merged, so how do we use these 6 datasets in a DE analysis? I have heard about the false discovery rate, but does that mean we test every possible healthy-diseased pair?

4.4 years ago

Let's say that you have 2 conditions with 3 biological replicates each: WT-1, WT-2, WT-3, KO-1, KO-2, KO-3. Samples of the same type are labelled with the same factor level, so the condition labels of your samples become WT, WT, WT, KO, KO, KO.

Both edgeR and DESeq2 take some sort of design argument. In DESeq2, for example, you supply a sample table (colData), which is a data.frame with samples as rownames and one column per experimental factor, together with a design formula that refers to those columns. For our example the data.frame would look like this:

> df
     condition
WT-1        WT
WT-2        WT
WT-3        WT
KO-1        KO
KO-2        KO
KO-3        KO
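
A table like this could be built in base R roughly as follows. This is only a sketch: the sample names are the ones from the example, and listing WT first in `levels` (so that WT becomes the reference level) is my assumption, not something the data requires.

```r
# Sample names from the example above
samples <- c("WT-1", "WT-2", "WT-3", "KO-1", "KO-2", "KO-3")

# One column per experimental factor; rownames are the sample names.
# Listing WT first makes it the reference level (an assumption for this sketch).
df <- data.frame(
  condition = factor(c("WT", "WT", "WT", "KO", "KO", "KO"),
                     levels = c("WT", "KO")),
  row.names = samples
)

df
```

The rownames need to match the column names of your count matrix, in the same order.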

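For this example dataset, the design formula for differential expression is then ~ condition. The model is fitted once per gene using all six samples together, with the replicates providing the estimate of within-condition variability, so you never compare individual healthy-diseased pairs. The false discovery rate correction is applied across genes, because one test is performed per gene.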
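
To make the role of the formula a bit more concrete, here is a hedged sketch. The first part uses base R's `model.matrix` to show what `~ condition` expands to. The second part only runs if DESeq2 is installed and uses a simulated count matrix (`counts`, with made-up gene names), purely for illustration; with real data you would pass your own raw count matrix instead.

```r
# Sample table from the example above (WT listed first so it is the reference level)
df <- data.frame(
  condition = factor(rep(c("WT", "KO"), each = 3), levels = c("WT", "KO")),
  row.names = c("WT-1", "WT-2", "WT-3", "KO-1", "KO-2", "KO-3")
)

# The formula expands to an intercept (WT baseline) and one KO-vs-WT coefficient,
# both estimated from all six samples together.
design_matrix <- model.matrix(~ condition, data = df)
print(design_matrix)

# The DESeq2 part only runs if the package is installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  # Simulated raw counts, for illustration only: 1000 hypothetical genes x 6 samples
  set.seed(1)
  n_genes <- 1000
  gene_means <- 2^runif(n_genes, 4, 10)
  counts <- matrix(
    rnbinom(n_genes * nrow(df), mu = gene_means, size = 10),
    nrow = n_genes,
    dimnames = list(paste0("gene", seq_len(n_genes)), rownames(df))
  )

  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = df,
                                        design = ~ condition)
  dds <- DESeq2::DESeq(dds)

  # KO relative to WT; one row per gene, padj is the BH-adjusted p-value
  res <- DESeq2::results(dds, contrast = c("condition", "KO", "WT"))
  print(head(res))
}
```

The `padj` column in the results is where the false discovery rate comes in: it adjusts for the number of genes tested, not for the number of sample pairs.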


To this excellent answer I just wanted to add that DESeq2 has many useful and detailed tutorials. For example:

http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

