Question

How to find differentially expressed genes between three groups?

5.9 years ago

John ▴ 270

Hi there,

How do I find differentially expressed genes between three groups? Basically, I want to know which expressed genes show the most variation between the three groups.

Thanks

RNA-Seq rna-seq ngs r R • 7.3k views

Updated 5.9 years ago by daniele.avancini ▴ 70 • written 5.9 years ago by John ▴ 270

Use likelihood ratio tests in edgeR.
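
To make the idea concrete: a likelihood ratio test compares a full model with one mean per group against a reduced model with a single shared mean, so one p-value covers all three groups at once. The sketch below is only an illustration in plain Python under simplifying assumptions. It uses a Poisson likelihood with no library-size normalization, whereas edgeR fits negative binomial GLMs with estimated dispersions. The function names and the toy counts are made up for the example.

```python
import math

def poisson_loglik(counts, mean):
    """Poisson log-likelihood of counts under one mean (log(c!) terms dropped)."""
    if mean == 0:
        return 0.0  # the mean is the MLE, so every count is zero here
    return sum(c * math.log(mean) - mean for c in counts)

def poisson_lrt(groups):
    """LRT of one mean per group (full) vs. one shared mean (reduced).

    groups: list of lists of counts, one inner list per group.
    Returns (statistic, p_value).
    """
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below assumes three groups (df = 2)")
    pooled = [c for g in groups for c in g]
    ll_reduced = poisson_loglik(pooled, sum(pooled) / len(pooled))
    ll_full = sum(poisson_loglik(g, sum(g) / len(g)) for g in groups)
    stat = max(0.0, 2.0 * (ll_full - ll_reduced))
    # For a chi-square distribution with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    return stat, math.exp(-stat / 2.0)

# Toy counts (assumed, not real data): three groups, three replicates each.
genes = {
    "geneA": [[10, 12, 11], [30, 28, 33], [9, 13, 10]],   # group 2 is higher
    "geneB": [[20, 22, 19], [21, 20, 23], [19, 22, 20]],  # roughly flat
}

for name, groups in genes.items():
    stat, p = poisson_lrt(groups)
    print(f"{name}: LRT statistic = {stat:.2f}, p = {p:.3g}")
```

The test says whether any group differs, not which one, so there is no single fold change attached to the p-value.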

Reply • 5.9 years ago by russhh 5.8k

score 2 · Answer 1 · 2019-03-28

5.9 years ago

daniele.avancini ▴ 70

With the classic tools (DESeq2, edgeR, ...) this is not possible; they usually allow a comparison between only two groups at a time. And rightly so, in my humble opinion, since it's in the nature of differential expression, which has only two directions (up- or downregulated). However, you can use some tricks, such as comparing one group vs. all the others, or ANOVA. You can see an example here.
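
As a rough illustration of the ANOVA trick, the sketch below computes a one-way ANOVA F statistic per gene and ranks genes by it. It assumes the values are already normalized, log-transformed expression; the function name `anova_f` and the toy numbers are hypothetical. It stops at the F statistic, because a p-value would need the F distribution from a statistics library.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of values)."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of samples
    grand = mean(v for g in groups for v in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")              # no within-group noise at all
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy log2 expression values (assumed, not real data): 3 groups x 3 replicates.
expression = {
    "geneA": [[3.1, 3.3, 3.0], [5.0, 5.2, 4.9], [3.2, 3.0, 3.1]],
    "geneB": [[4.0, 4.3, 3.9], [4.1, 4.0, 4.2], [3.9, 4.2, 4.1]],
}

# Genes with the largest F vary most between groups relative to within-group noise.
ranked = sorted(expression, key=lambda g: anova_f(expression[g]), reverse=True)
for gene in ranked:
    print(gene, round(anova_f(expression[gene]), 2))
```

A one-vs-all-others comparison is the same calculation with two of the three groups merged into one.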

Strictly speaking, I think you can sometimes have a p-value for a categorical variable with more than 2 levels, but you can't have a fold-change value.

So, I like your ANOVA example, but I don't think it is correct to claim that no method for calculating RNA-Seq p-values can handle a categorical variable with more than 2 levels. For example, I know DESeq2 / edgeR / limma-voom can all handle continuous variable analysis (even though I was previously confused about this), so I'm 100% certain that you could make a comparison if you converted your categories into a numeric score.

Reply • 5.9 years ago by Charles Warden 8.3k

My apologies, I did not know you could, and thank you for pointing it out. Could you elaborate a bit more on how to convert categories into numeric scores? Anyway, I said it was not possible because I thought you needed both a fold change and the significance of that fold change, not only the statistic, to reach any conclusion about differential expression. I personally wouldn't use the ANOVA example I gave earlier, since I believe the most reliable results are obtained with the old-fashioned two-group (pairwise) differential analysis.

Reply • 5.9 years ago by daniele.avancini ▴ 70

R can do this automatically with the as.numeric() function, but I'm guessing that isn't what you want.

So, I don't know if this is relevant to your comparison, but here is a general example of what I mean:

Let's say you have 3 categories:

1) Over-Expression

2) WT

3) Knock-Out

You can define a new numeric variable where you map numbers to the categories. For example, with those categories, maybe something like this would make sense:

"Over-Expression" --> +1

"WT" --> 0

"Knock-Out" --> -1

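The mapping above can be sketched in a few lines. This is only an illustration in plain Python with made-up sample labels and log2 expression values. It assumes the three categories are equally spaced, and the `slope` helper is a hypothetical stand-in for the per-gene model a real package would fit. Note that calling as.numeric() on an R factor returns codes in level order (alphabetical by default, so Knock-Out = 1, Over-Expression = 2, WT = 3), which is not the ordering wanted here.

```python
# Explicit, biologically ordered scores for each category.
SCORES = {"Over-Expression": 1, "WT": 0, "Knock-Out": -1}

def slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Toy data (assumed): one label per sample, one log2 expression value per sample.
samples = ["Knock-Out", "Knock-Out", "WT", "WT", "Over-Expression", "Over-Expression"]
log2_expr = [2.0, 2.2, 3.1, 2.9, 4.0, 4.2]

score = [SCORES[s] for s in samples]   # -1, -1, 0, 0, 1, 1
trend = slope(score, log2_expr)        # change in log2 expression per score step
print(f"slope per step: {trend:.2f}")
```

A numeric score like this only captures a monotonic trend. A gene that changes in the same direction in both Knock-Out and Over-Expression relative to WT would get a slope near zero.
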
Reply • 5.9 years ago by Charles Warden 8.3k