edgeR make contrast
5.8 years ago
Bnf83 ▴ 150

Hi guys,

I'm dealing with an unusual analytical requirement: I'm performing a differential gene expression analysis on some human samples.

I have neither controls nor replicates. The samples are grouped into two groups, let's say A and B. Now I have to compare A vs B+A. Normally I compare A vs B; I have never had to compare A vs B+A. How can I set up this contrast in edgeR?

I usually use:

con <- makeContrasts(A - B, levels=design)

Thank you in advance

RNA-Seq edgeR contrasts • 4.4k views

Isn't A versus B+A just a contrast on B-versus-zero? All you'll get is a summary of average expression.

Yes. If you wish to compare the expression in A to the expression in A plus the expression in B you are really testing if the expression in group B is zero:

H0: A = A + B

=> A - A = A + B - A => 0 = B

Do you mean that some samples have received treatment A and some samples have received treatment A and B?
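
The cancellation above can be checked with plain contrast vectors. This is a minimal base-R sketch, assuming a hypothetical design with one coefficient per group (as from model.matrix(~ 0 + group)); the vector names are made up for illustration:

```r
# Contrast weights over the two group coefficients (A, B).
a_only   <- c(A = 1, B = 0)   # expression in A
a_plus_b <- c(A = 1, B = 1)   # expression in A plus expression in B

# "A versus A+B" is the difference of the two weight vectors.
a_vs_a_plus_b <- a_only - a_plus_b

# The weight on A cancels; only a (negative) weight on B remains,
# so the test reduces to "is the B coefficient zero?".
print(a_vs_a_plus_b)
```

Because the group coefficients in such a design describe abundance rather than a difference, testing one of them against zero says little about differential expression.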

Unfortunately, the samples did not receive a treatment or a specific condition, which is why a requirement of this type seems strange to me. Anyway, I think they would like the relative expression value of A, i.e. a sort of delta of A over A+B. I cannot figure out the rationale.

I don't mean to pry, but could you give us a bit more detail about the actual study? It may simply be that your collaborators have made a mistake in explaining what they want you to compare. They may be asking you to compare expression in the set A against expression in the set A ∪ B, for example. IMO biologists / medics don't talk in terms of fitted coefficients.

No problem! I have around 100 primary breast cancer samples. They performed RNA-seq and then clustered the samples, identifying clusters of patients (which I called groups here). Then they asked me for the comparison I already explained.

Were they clustered using one subset of the genes, and now you're running diffex on a separate subset of the genes?

I think you need to talk to them about what their biological question is. There are screen methods where things like A/(A+B) are used as a measure of effect size, but you would test if this was equal to zero, not if A was equal to A + B. And I can't see this being a meaningful comparison in something like RNA-seq.

I totally agree with you!

Your post is not a Tutorial, it is a Question, please use the appropriate category.

Let me see if I understood this correctly: you have one sample in group A, one sample in group B, and you want to compare A vs B+A? Does this even make sense?

I agree with you. Anyway, suppose you have a gene "x" and you want to perform the DGE analysis on 10 samples of group A and 13 of group B. They asked me for DGE on A versus B plus A itself, i.e. if the expression of x is 40 in A and 20 in B, then it is 60 in A+B, and the DGE will be 40 vs 60. Although it is mathematically clear, it is difficult for me to write the contrast properly.

5.8 years ago

From the discussions above, I believe you are being asked to test the difference between those samples in group A compared to an average of all the samples. This is, in fact, the traditional (before R) way to test contrasts.

This, it turns out, is actually fairly easy to code up, and simply relies on using a different contrast encoding system for your model matrix.

Create a conditions frame/factor for your groups (A or B) and set its contrasts model to contr.sum:

cluster <- factor(c("A", "A", "B", "B"))
contrasts(cluster) <- contr.sum(2)

You can now create your model matrix as usual:

design <- model.matrix(~ 1 + cluster)
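
To see what this coding produces, here is a small base-R sketch using the same four-sample toy factor as above (two samples per cluster, purely for illustration):

```r
# Toy factor: two samples in each cluster.
cluster <- factor(c("A", "A", "B", "B"))

# Sum-to-zero coding: cluster A is coded +1, cluster B is coded -1.
contrasts(cluster) <- contr.sum(2)

# Column 1 is the intercept (all ones); column 2 ("cluster1") holds
# +1 for samples in A and -1 for samples in B.
design <- model.matrix(~ 1 + cluster)
print(design)
```

With the default treatment coding, the second column would instead be a 0/1 indicator for cluster B, so the coefficient would be the usual B vs A difference.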

When you fit your linear model, you will fit two coefficients: one is the intercept (effectively the average of A and B, on the log scale) and the other is the difference between A and that average (i.e. membership of cluster A). There is no need to specify a contrast in your edgeR workflow; the coefficient of interest is coef=2 in your glmLRT call.
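
Put together, the workflow might look like the sketch below. It is only an illustration under stated assumptions: the counts are simulated, the sample and gene names are invented, and the edgeR steps run only if the package is installed.

```r
# Simulated counts, purely illustrative: 200 genes x 6 samples (3 per cluster).
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

cluster <- factor(rep(c("A", "B"), each = 3))
contrasts(cluster) <- contr.sum(2)        # A = +1, B = -1
design <- model.matrix(~ 1 + cluster)

if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  y   <- DGEList(counts = counts)
  y   <- calcNormFactors(y)
  y   <- estimateDisp(y, design)
  fit <- glmFit(y, design)
  # Coefficient 2 is the deviation of cluster A from the A/B average.
  lrt <- glmLRT(fit, coef = 2)
  print(topTags(lrt))
}
```

Note that with only two groups this coefficient is half the A vs B log fold change, so the likelihood ratio test should give the same p-values as a direct A vs B comparison; only the reported logFC changes.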

You may also want to see this paper: https://www.biorxiv.org/content/10.1101/463265v1, which explains why differential expression testing after clustering is a bad idea and suggests some alternatives.

Personally, I'd try to bi-cluster, find the gene cluster that was driving the sample cluster and have a look what genes were in it.
