6.0 years ago
Vasu
Hi,
I have 8 RNA-Seq samples: 4 controls and 4 treatments. I'm interested in doing a differential expression analysis with edgeR. The column data are as follows.
Samples Type
Sample1 Control
Sample2 Control
Sample5 Control
Sample6 Control
Sample7 Treatment
Sample8 Treatment
Sample3 Treatment
Sample4 Treatment
In the table above, Sample1 and Sample2 [Control] and Sample3 and Sample4 [Treatment] were processed on one day, while Sample5 and Sample6 [Control] and Sample7 and Sample8 [Treatment] were processed on another day.
As you can see, the replicates were not all processed together, so there is a batch effect. How can I create the design matrix in edgeR for the differential analysis in this case?
Could you please tell me which section I should check for this?
The section about batch effect. But reading from the start is also wise...
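To make the idea concrete, here is a minimal sketch of what an additive "batch + type" design looks like for the eight samples listed in the question. It is written in plain Python purely for illustration and is not edgeR code. The labels `Day1`/`Day2` and the column names are assumptions, since the question only says "one day" and "other day".

```python
# Sample layout taken from the question. The batch labels "Day1"/"Day2"
# are assumed names for the two processing days.
samples = [
    ("Sample1", "Control", "Day1"),
    ("Sample2", "Control", "Day1"),
    ("Sample5", "Control", "Day2"),
    ("Sample6", "Control", "Day2"),
    ("Sample7", "Treatment", "Day2"),
    ("Sample8", "Treatment", "Day2"),
    ("Sample3", "Treatment", "Day1"),
    ("Sample4", "Treatment", "Day1"),
]


def additive_design(rows):
    """Return one row per sample: [intercept, batch is Day2, type is Treatment]."""
    return [
        [1, int(batch == "Day2"), int(kind == "Treatment")]
        for _name, kind, batch in rows
    ]


design = additive_design(samples)

# Print the design matrix with one labelled row per sample.
print("Sample    Intercept  BatchDay2  TypeTreatment")
for (name, _kind, _batch), row in zip(samples, design):
    print(f"{name:<9} {row[0]:>9}  {row[1]:>9}  {row[2]:>13}")
```

In R, a matrix of this shape is what `model.matrix(~ Batch + Type)` would give for two-level factors with `Day1` and `Control` as the reference levels, assuming the column data had an extra `Batch` column (which the table in the question does not show). The last column is the treatment effect adjusted for processing day.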
I had a look at it. This is the first time I'm working with such data. Could you please tell me whether this is right or not?
I created the design matrix like the following:
And the design looks like below:
Then I used the following commands for the linear model fit and DEA.
Do you think this is right?
It looks all right, but one question: you call the batch "replicate". Does that mean they were from the same sample? Is it technical replication?
Yes, they were from the same sample, but the RNA extraction was done on the next day.
As mentioned above, Sample1 and Sample2 [Control] and Sample3 and Sample4 [Treatment] were processed on one day, and Sample5 and Sample6 [Control] and Sample7 and Sample8 [Treatment] were processed on another day.
All the samples are from the same cell-line.
How can you be sure that this 'batch' effect is going to bias your results? - what evidence have you seen? In many experiments, samples are processed on separate days with minimal or no effect on the end results. If we did a time-course experiment, for example, and assumed that time was a batch effect, then we would wipe out the very differences that we wanted to find based on time.
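One way to see the time-course concern is to check whether the batch column can be separated from the variable of interest at all. The sketch below, in plain Python and with hypothetical helper names, compares the rank of two design matrices with columns [intercept, batch, type]. One uses the balanced layout from the question. The other is a made-up confounded layout in which every control was processed on one day and every treatment on the other.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Columns: intercept, batch (1 = second day), type (1 = Treatment).
# Balanced layout from the question: both types appear on both days.
balanced = [
    [1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0],
    [1, 1, 1], [1, 1, 1], [1, 0, 1], [1, 0, 1],
]

# Hypothetical confounded layout: batch and type columns are identical.
confounded = [[1, 0, 0]] * 4 + [[1, 1, 1]] * 4

print("balanced rank:", matrix_rank(balanced))
print("confounded rank:", matrix_rank(confounded))
```

With three columns, a rank of 3 means the batch and treatment effects can be estimated separately, which is the case for the layout in the question. A rank of 2 means the two columns carry the same information, so adjusting for "batch" would remove the effect of interest, as in the time-course example.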