Question

Differential Expression Analysis When Most Features Are Not Expected To Be Unchanged?

2

12.2 years ago

Ryan Thompson ★ 3.6k

I have an RNA-seq dataset of small RNAs. I am using blockbuster to group my mapped reads into blocks, and I would then like to perform a differential expression analysis on the blocks (based on counts of uniquely mapped reads assigned to each block in each sample). However, I don't know whether tools like edgeR and DESeq will work properly for this, because unlike protein-coding genes, most of these small RNAs cannot be assumed to have unchanged expression levels between samples.

I think I can use an approach similar to this paper to select a set of blocks that I am reasonably certain are not differentially expressed between samples. Essentially, I would select a subset of blocks whose expression ranks change very little relative to each other across all the samples.

Question 1: Is this a reasonable approach to selecting a set of genes to use as references for differential expression?

Question 2: Assuming that I have such a set of genes that I believe are not differentially expressed (i.e. proverbial "housekeeping" genes), how can I make use of this information in edgeR or DESeq?
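
The two questions above can be made concrete with a minimal sketch. It is not edgeR or DESeq code and not the method from the paper mentioned; it is a plain-Python illustration under stated assumptions: counts are held in a dict keyed by hypothetical block names, stability is judged by the spread of within-sample ranks (the threshold `max_rank_range` is an arbitrary illustrative parameter), and scaling factors come from a median-of-ratios rule restricted to the selected blocks. All function and variable names are made up for the example.

```python
import math
from statistics import median


def rank_within_sample(values):
    """Return 0-based ranks of the values within one sample (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def select_stable_blocks(counts, max_rank_range):
    """Keep blocks whose within-sample rank varies by at most max_rank_range across samples."""
    blocks = list(counts)
    n_samples = len(counts[blocks[0]])
    ranks = {b: [] for b in blocks}
    for s in range(n_samples):
        column = [counts[b][s] for b in blocks]
        for b, r in zip(blocks, rank_within_sample(column)):
            ranks[b].append(r)
    return [b for b in blocks if max(ranks[b]) - min(ranks[b]) <= max_rank_range]


def size_factors(counts, reference_blocks):
    """Median-of-ratios scaling factors computed from the reference blocks only."""
    n_samples = len(next(iter(counts.values())))
    # Log geometric mean per reference block; blocks containing a zero count are skipped.
    log_geo_mean = {}
    for b in reference_blocks:
        row = counts[b]
        if all(c > 0 for c in row):
            log_geo_mean[b] = sum(math.log(c) for c in row) / len(row)
    factors = []
    for s in range(n_samples):
        log_ratios = [math.log(counts[b][s]) - log_geo_mean[b] for b in log_geo_mean]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy data: three samples, sample 2 sequenced twice as deeply as
# sample 1 and sample 3 half as deeply; block_d changes between samples.
counts = {
    "block_a": [100, 200, 50],
    "block_b": [10, 20, 5],
    "block_c": [1000, 2000, 500],
    "block_d": [50, 900, 300],
}

reference = select_stable_blocks(counts, max_rank_range=0)
factors = size_factors(counts, reference)
print(reference)  # ['block_b', 'block_c']
print(factors)    # approximately [1.0, 2.0, 0.5]
```

Factors computed this way could then stand in for the package defaults; DESeq, for instance, allows size factors to be assigned manually instead of being estimated from all features. The toy data also shows a weakness of rank-based selection: block_a scales exactly like the reference blocks, yet it is excluded because block_d moving past it shifts its rank.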

rna-seq • 3.8k views

Updated 12.2 years ago by Sean Davis 27k • written 12.2 years ago by Ryan Thompson ★ 3.6k

0

When you say "not predominantly unchanging in their expression levels", roughly what percentage of the small RNAs do you expect to be differentially expressed? And how many small RNAs in total are known, or are you considering? I ask because the TMM paper from Mark Robinson rests on the claim that not many of the genes are differentially expressed, in spite of the biological variation.

Reply written 12.2 years ago by Arun 2.4k
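
The percentage asked about in the comment above matters because scaling based on a median or trimmed mean is only robust while the changed features are a minority. The sketch below is a toy illustration of that point, not TMM itself: it uses a plain median of per-feature ratios as a stand-in scaling factor, and the feature counts, fold change, and function names are all hypothetical.

```python
from statistics import median


def median_ratio(sample_a, sample_b):
    """Median of per-feature ratios b/a, used as a stand-in scaling factor."""
    return median(b / a for a, b in zip(sample_a, sample_b))


def simulate(n_features, n_changed, fold=2.0, base=100.0):
    """Two samples with identical depth; the first n_changed features are
    multiplied by `fold` in the second sample. A correct scaling factor is 1.0."""
    sample_a = [base] * n_features
    sample_b = [base * fold if i < n_changed else base for i in range(n_features)]
    return median_ratio(sample_a, sample_b)


minority = simulate(n_features=10, n_changed=3)  # 30% of features changed
majority = simulate(n_features=10, n_changed=7)  # 70% of features changed
print(minority)  # 1.0: the unchanged majority fixes the estimate
print(majority)  # 2.0: the real change is mistaken for a depth difference
```

When most features change in the same direction, the estimated factor absorbs the biological signal, and that is the situation in which a trusted reference set becomes useful.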

score 0 · Answer 1 · 2012-09-02

12.2 years ago

Sean Davis 27k

I would suggest trying the edgeR or DESeq routes first. Determining "housekeeping" genes is actually not trivial and probably has as many (or more) pitfalls as the general assumptions in edgeR and DESeq. After performing a more typical analysis, you can always revisit something more complex.
