edgeR design matrix and comparisons for paired samples
4.3 years ago
silas008 ▴ 170

Hi guys,

I am a bit confused about the statistics for paired samples in edgeR.

I have 4 different treatments, A, B, C and D, each one with 4 samples. 2 of those samples are "before" treatment and the other 2 are "after" treatment.

If I am reading the edgeR manual correctly, the design matrix should be built like this:

> groups <- factor(targets$Group)
> treatment <- factor(targets$Treatment, levels=c("before","after"))
> design <- model.matrix(~groups+treatment)

But my data is a simple table with the genes in the first column and the samples in the other columns. How can I construct the model matrix for this table format?

I think I can simply read the table in as a matrix and assign the factors to the samples:

> # Assumes the first column of my_table holds the gene identifiers
> rownames(my_table) <- my_table[, 1]
> my_table <- data.matrix(my_table[, -1])
> # Assumes the columns are ordered A1-A4, B1-B4, C1-C4, D1-D4
> groups <- factor(rep(c("A", "B", "C", "D"), each = 4))
> treatment <- factor(rep(c("before", "before", "after", "after"), times = 4), levels = c("before", "after"))
> design <- model.matrix(~groups+treatment)
> y <- DGEList(counts=my_table, group=groups)

But I don't know if this is correct.

Can anyone help with this? I'd really appreciate it.

Thanks

RNA-Seq edgeR • 1.2k views
4.3 years ago
h.mon 35k

The "correct" way will depend on what A, B, C, D, before and after are, and on what you want to test. It seems to me that a better approach in your case (not that what you did is wrong) would be to create a single factor combining both group and treatment:

Group <- factor( paste( groups, treatment, sep = "." ) )
design <- model.matrix( ~ 0 + Group )
y <- DGEList( counts = my_table, group = Group )
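
To make the combined factor concrete, here is a minimal sketch in base R. It assumes the 16 count columns are ordered A (before, before, after, after), then B, C and D in the same pattern; that ordering and the contrast at the end are illustrative assumptions, not something stated in the question.

```r
# Assumed sample layout: 4 groups x (2 before + 2 after), in column order.
groups <- factor(rep(c("A", "B", "C", "D"), each = 4))
treatment <- factor(rep(c("before", "before", "after", "after"), times = 4),
                    levels = c("before", "after"))

# One level per group/treatment combination, e.g. "A.before", "A.after".
Group <- factor(paste(groups, treatment, sep = "."))

# No-intercept design: each column corresponds to one combination.
design <- model.matrix(~ 0 + Group)
colnames(design) <- levels(Group)

# Hand-built contrast for "after vs before within A" (hypothetical example).
contrast_A <- setNames(numeric(ncol(design)), colnames(design))
contrast_A[c("A.after", "A.before")] <- c(1, -1)
```

A vector like contrast_A could then be supplied as the contrast argument of edgeR's glmQLFTest() or glmLRT() once the model has been fitted. Which contrasts make sense depends on the question you actually want to ask.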

"I have 4 different treatments, A, B, C and D, each one with 4 samples. 2 of those samples are "before" treatment and the other 2 are "after" treatment."

If A, B, C and D are treatments, why do you name the factor that describes them "groups"? And if before and after are sampling times, why do you call that factor "treatment" instead of "time"? Naming variables descriptively will make your code easier to understand.
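
As a rough illustration of the naming point, the sketch below keeps the sample annotation in one small data frame with self-explanatory column names. The names samples, treatment and time are my own choices, and the layout (2 before plus 2 after per treatment, in column order) is assumed.

```r
# Hypothetical sample sheet; column names and sample order are assumptions.
samples <- data.frame(
  treatment = factor(rep(c("A", "B", "C", "D"), each = 4)),
  time = factor(rep(c("before", "before", "after", "after"), times = 4),
                levels = c("before", "after"))
)

# Cross-tabulate to confirm every treatment/time cell holds 2 samples.
layout_check <- table(samples$treatment, samples$time)

# Same additive model as in the question, now with descriptive term names.
design <- model.matrix(~ treatment + time, data = samples)
```

With these names the design columns read treatmentB, treatmentC, treatmentD and timeafter, which is easier to interpret when choosing coefficients to test.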
