Hello,
I am new to transcriptomics and have mostly learned from online sources and trial and error. I have 20 samples analyzed for miRNA. I was wondering whether I can correct for batch effects given that, as you can see below, one batch contains only a single sample and the other batches are unevenly distributed across treatments. I have already run differential expression analysis using DESeq2 and edgeR, but these two programs take only raw counts as input, whereas limma::removeBatchEffect requires normalized data.
sample treatment batch
1 high 4
2 high 6
3 high 6
4 high 6
5 high 6
6 high 7
7 high 8
8 high 9
9 high 9
10 high 9
11 low 6
12 low 7
13 low 8
14 low 9
15 low 9
16 low 9
17 low 9
18 low 9
19 low 9
20 low 9
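One way to see whether a design with both batch and treatment can be fitted at all is to cross-tabulate the two variables and check that the design matrix is full rank. The following is a minimal base R sketch, assuming the sample sheet above is transcribed into a data frame named meta (the name is taken from the DESeqDataSetFromMatrix call in this post; the variable names tab, design, full_rank and singleton_batches are illustrative only):

```r
# Sample sheet transcribed from the table above.
# batch is stored as a factor so it is treated as categorical, not numeric.
meta <- data.frame(
  sample = 1:20,
  treatment = factor(rep(c("high", "low"), each = 10)),
  batch = factor(c(4, 6, 6, 6, 6, 7, 8, 9, 9, 9,
                   6, 7, 8, 9, 9, 9, 9, 9, 9, 9))
)

# Cross-tabulate batch against treatment to see how unbalanced the design is.
tab <- table(meta$batch, meta$treatment)
print(tab)

# Build the design matrix for ~ batch + treatment and check it is full rank,
# i.e. the treatment effect is not completely confounded with batch.
design <- model.matrix(~ batch + treatment, data = meta)
full_rank <- qr(design)$rank == ncol(design)

# Batches that contain a single sample contribute nothing to the
# within-batch comparison of high versus low.
singleton_batches <- rownames(tab)[rowSums(tab) == 1]
```

Batches 6 to 9 each contain both high and low samples, which is what makes the treatment effect estimable after adjusting for batch. The single sample in batch 4 is essentially absorbed by its own batch coefficient.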
dds <- DESeqDataSetFromMatrix(countData = cts, colData = meta, design = ~ batch + treatment)
How can I correct for batch and run differential expression analysis on my data? Thank you!
So, if I include batch in the design, does that already correct for this variable? I did some PCA as shown below:
There also seems to be no clear clustering by batch, so I see no evidence of a batch effect. So in the end, should I just include batch in my design and run DESeq2?
From what I understand, including batch in the design accounts for the differences between the groups that can be explained by batch (i.e. the unwanted batch variation).
I see. Thank you for your replies! I appreciate it.
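To illustrate what "including batch in the design" does, here is a small base R sketch. It is not DESeq2 itself: it swaps in an ordinary linear model (lm) on made-up, noise-free log-scale values for a single hypothetical miRNA, using the batch layout from the sample table. The batch shifts, the baseline of 5 and the true treatment effect of 2 are all assumptions chosen for illustration, not values from this dataset.

```r
# Batch layout from the sample table in the question.
meta <- data.frame(
  treatment = factor(rep(c("high", "low"), each = 10)),
  batch = factor(c(4, 6, 6, 6, 6, 7, 8, 9, 9, 9,
                   6, 7, 8, 9, 9, 9, 9, 9, 9, 9))
)

# Hypothetical per-batch shifts and treatment effect (assumed values).
batch_shift <- c("4" = 0, "6" = 1, "7" = -1, "8" = 0.5, "9" = 3)
true_effect <- 2

# Noise-free log-scale expression: baseline + treatment effect + batch shift.
y <- 5 + true_effect * (meta$treatment == "low") +
  batch_shift[as.character(meta$batch)]

# Model without batch: the treatment estimate absorbs part of the batch
# shifts, because batches are unevenly split between high and low.
naive <- unname(coef(lm(y ~ treatment, data = meta))["treatmentlow"])

# Model with batch in the design: the treatment effect is estimated from
# within-batch differences, so the batch shifts no longer bias it.
adjusted <- unname(coef(lm(y ~ batch + treatment, data = meta))["treatmentlow"])

print(c(naive = naive, adjusted = adjusted))
```

The same idea carries over to DESeq2 and edgeR, which fit a negative binomial GLM to the raw counts with batch as a covariate, so the counts themselves are not modified. limma::removeBatchEffect is generally used on normalized or transformed values for visualization such as PCA, rather than as input to the differential expression test.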