I have 8 samples in total: 4 controls and 4 samples overexpressing the FOXCUT gene.
The column data for all 8 samples, with replicate and cell-line information, is shown below:
Samples          TYPE                    Replicate  Cell-lines
Cell1_HA1        Control                 1          1
Cell1_HA2        Control                 2          1
Cell1_foxcut11   FOXCUT_OverExpression   1          1
Cell1_foxcut12   FOXCUT_OverExpression   2          1
Cell2_HA1        Control                 3          2
Cell2_HA2        Control                 4          2
Cell2_foxcut11   FOXCUT_OverExpression   3          2
Cell2_foxcut12   FOXCUT_OverExpression   4          2
I have count data for all 8 samples after STAR alignment, and I'm using the edgeR package for differential analysis. This is the first time I'm doing a differential analysis on cell-line data with replicate information, and I don't know how to create the design matrix and contrast.matrix for comparing the samples.
I want to run the differential analysis for the following comparisons:
Cell1_foxcut samples vs Cell1_HA samples
Cell2_foxcut samples vs Cell2_HA samples
Can anyone please help me with how to group the samples, how to create the design matrix, and how to specify coef for these comparisons?
What does a PCA plot of the whole dataset look like? If your cell lines are considerably different (which is very likely), you are better off performing a separate analysis for each cell line.
Please check this:
The plot looks like this
Your samples cluster mainly by cell line and not by treatment, which is what I would expect for cell lines. Therefore, only compare treatments within the same cell line, not across cell lines, as the confounding effect of the cell line is most likely too dominant.
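One way to produce a PCA plot like the one discussed here is sketched below in base R. It assumes a genes-by-samples count matrix whose column names are the sample names from the question. The counts matrix is simulated placeholder data (not the poster's data), and the log2-CPM transform with a pseudocount of 1 is an assumption made for illustration; edgeR's plotMDS on a DGEList is a common alternative.

```r
# Simulated placeholder counts (NOT real data): 200 genes x 8 samples.
set.seed(1)
samples <- c("Cell1_HA1", "Cell1_HA2", "Cell1_foxcut11", "Cell1_foxcut12",
             "Cell2_HA1", "Cell2_HA2", "Cell2_foxcut11", "Cell2_foxcut12")
counts <- matrix(rnbinom(200 * 8, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), samples))

# Counts per million per sample, then log2 with a pseudocount of 1.
cpm <- t(t(counts) / colSums(counts)) * 1e6
logcpm <- log2(cpm + 1)

# prcomp expects samples in rows, so transpose the matrix.
pca <- prcomp(t(logcpm))

# Colour points by the cell-line prefix of the sample name.
cell_line <- factor(sub("_.*", "", samples))
plot(pca$x[, 1], pca$x[, 2], col = as.integer(cell_line), pch = 19,
     xlab = "PC1", ylab = "PC2")
text(pca$x[, 1], pca$x[, 2], labels = samples, pos = 3, cex = 0.7)
```

With real data, samples separating by cell line along PC1 would support analysing each cell line on its own.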
Yes, the differential analysis needs to be done within the same cell line. I edited my question. Could you please tell me the syntax for the group, design matrix and contrasts in edgeR? Thank you.
@ATpoint Hi, could you please tell me how to create the design matrix for the differential analysis within the same cell line?
Do you think the below code is right?
I would simply make two separate experiments (y) and then use ~ TYPE. As the cell lines are probably quite different from each other, having them in one y might screw up the normalization factors.
May I know how this can be done, please? I haven't seen this type of analysis anywhere, so I don't know how to do it.
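The suggestion above (one object per cell line, each with a ~ TYPE design) can be sketched in base R as follows. The sample table is transcribed from the question; the column name CellLine (in place of "Cell-lines") and the object names coldata and designs are assumptions made for this example.

```r
# Sample table transcribed from the question.
coldata <- data.frame(
  Samples = c("Cell1_HA1", "Cell1_HA2", "Cell1_foxcut11", "Cell1_foxcut12",
              "Cell2_HA1", "Cell2_HA2", "Cell2_foxcut11", "Cell2_foxcut12"),
  TYPE = factor(rep(rep(c("Control", "FOXCUT_OverExpression"), each = 2), 2),
                levels = c("Control", "FOXCUT_OverExpression")),
  Replicate = c(1, 2, 1, 2, 3, 4, 3, 4),
  CellLine = factor(rep(c(1, 2), each = 4))
)

# Build one design matrix per cell line. Control is the first factor level,
# so it becomes the reference (intercept) group.
designs <- lapply(split(coldata, coldata$CellLine), function(d) {
  design <- model.matrix(~ TYPE, data = d)
  rownames(design) <- d$Samples
  design
})

# Design for cell line 1: an intercept column plus one treatment column.
print(designs[["1"]])
```

With this parameterisation the second column, TYPEFOXCUT_OverExpression, is the overexpression-versus-control effect, so coef = 2 would be the coefficient to test within each cell line, and no separate contrast matrix is needed.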
Instead of importing all 8 samples into R, simply import the first 4 as one object and the second 4 as a second object. Can you show the code that imported the data into R?
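A minimal sketch of that split, followed by one possible edgeR run per cell line. It assumes the counts are already in a genes-by-samples matrix whose column names match the sample names in the question. The counts matrix is simulated placeholder data, run_edger is a hypothetical helper name, and the quasi-likelihood pipeline shown is one common edgeR workflow, not necessarily the one intended in the reply above. The edgeR steps only run if the package is installed.

```r
# Simulated placeholder counts (NOT real data): 500 genes x 8 samples.
set.seed(1)
samples <- c("Cell1_HA1", "Cell1_HA2", "Cell1_foxcut11", "Cell1_foxcut12",
             "Cell2_HA1", "Cell2_HA2", "Cell2_foxcut11", "Cell2_foxcut12")
counts <- matrix(rnbinom(500 * 8, mu = 100, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), samples))

# Split the columns into one matrix per cell line using the name prefix.
counts_cell1 <- counts[, grepl("^Cell1_", colnames(counts))]
counts_cell2 <- counts[, grepl("^Cell2_", colnames(counts))]

# Treatment labels in column order; the same order holds in both subsets.
type <- factor(c("Control", "Control",
                 "FOXCUT_OverExpression", "FOXCUT_OverExpression"),
               levels = c("Control", "FOXCUT_OverExpression"))

# Hypothetical helper: a standard edgeR quasi-likelihood analysis
# for a single cell line.
run_edger <- function(cts, type) {
  design <- model.matrix(~ type)
  y <- edgeR::DGEList(counts = cts, group = type)
  keep <- edgeR::filterByExpr(y, design)   # drop lowly expressed genes
  y <- y[keep, , keep.lib.sizes = FALSE]
  y <- edgeR::calcNormFactors(y)           # normalisation within one cell line
  y <- edgeR::estimateDisp(y, design)
  fit <- edgeR::glmQLFit(y, design)
  edgeR::glmQLFTest(fit, coef = 2)         # overexpression vs control
}

if (requireNamespace("edgeR", quietly = TRUE)) {
  res_cell1 <- run_edger(counts_cell1, type)
  res_cell2 <- run_edger(counts_cell2, type)
  print(edgeR::topTags(res_cell1))
}
```

Because each cell line gets its own DGEList, the normalization factors and dispersions are estimated from that cell line's four samples only.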
Instead of a table, here is the count data for all samples for a few genes.
This is the code I used.
@ATpoint Could you please tell me what is wrong with my code above?