Question

DAVID for RNA-seq Data

2

13.3 years ago

Marina Manrique ★ 1.3k

Hi there,

I'm thinking about using DAVID for GO-term enrichment analysis on a set of DE genes from RNA-seq. The thing is, all the references I've found so far about using DAVID are for microarray data, and some of them for ChIP-seq data.

Besides, in the paper describing GOseq, they point out that

standard methods give biased results on RNA-seq data due to over-detection of differential expression for long and highly expressed transcripts.

So it makes me wonder whether DAVID is really appropriate. Have you used it for RNA-seq?

Any feedback and reference is welcome :)

Thanks!

gene rna • 9.3k views

Updated 2.8 years ago by Ram 45k • written 13.3 years ago by Marina Manrique ★ 1.3k

score 4 · Answer 1 · 2012-01-04

4

13.3 years ago

Sean Davis 27k

DAVID is not really appropriate for count-based data like RNA-seq, for the reason given in the GOseq paper.
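For context, a standard over-representation test of the kind the GOseq quote refers to treats every gene as equally likely to end up in the DE list, which is exactly the assumption that length and expression-level bias break. The following is only a minimal sketch of such a hypergeometric test in plain Python; it is not DAVID's actual implementation, and the function name and the example numbers are made up for illustration.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_in_term, n_de, n_de_in_term):
    """Upper-tail hypergeometric p-value for one GO term.

    Probability of seeing at least n_de_in_term annotated genes when
    n_de genes are drawn without replacement from n_universe genes,
    of which n_in_term carry the term. Every gene is assumed equally
    likely to be drawn (the assumption questioned in this thread).
    """
    total = comb(n_universe, n_de)
    upper = min(n_in_term, n_de)
    tail = sum(
        comb(n_in_term, k) * comb(n_universe - n_in_term, n_de - k)
        for k in range(n_de_in_term, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20000 genes, 200 in the term, 500 DE, 15 DE in the term.
p_value = hypergeom_enrichment_p(20000, 200, 500, 15)
print(p_value)
```

If long or highly expressed genes are over-represented in the DE list, terms enriched for such genes can look significant under this test for purely technical reasons.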

13.3 years ago by Sean Davis 27k

2

I was going to cite this paper for further exploration, but upon a quick re-skim, it seems they (sadly) didn't do any analysis on the tag count vs. ability to call differential expression. Still useful to peruse, though ...

13.3 years ago by Steve Lianoglou 5.2k

1

Some navel gazing in hopes of being a bit more thorough: if I recall correctly, the problem is the bias in RNA-seq differential expression calls and its relationship to transcript length. If you are using a "tag-sequencing" method for gene expression analysis that doesn't have this bias (i.e. SAGEseq, deepCAGE, or similar), "normal" GO analysis downstream of appropriate differential expression calls should suffice, no?

13.3 years ago by Steve Lianoglou 5.2k

1

This one might be closer? http://genomebiology.com/2010/11/10/R106. More analytically, think of the counts as Poisson-distributed (they are not, but the approximation is not too far off for low counts), so the mean and variance are equal. As the number of counts increases, the distribution becomes tighter relative to its mean, so the same fold change is easier to detect. I'm no statistician, but hopefully the point is coming across.
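To put rough numbers on the Poisson argument: if the variance equals the mean, the standard deviation relative to the mean is 1/sqrt(mean), so it shrinks as counts grow. The sketch below is only an illustration under that Poisson assumption; `fold_change_z` is a hypothetical helper using a crude normal approximation to the difference of two Poisson counts, not the test used by any particular DE tool.

```python
from math import sqrt

def poisson_cv(mean_count):
    """Coefficient of variation (sd / mean) of a Poisson count."""
    return sqrt(mean_count) / mean_count

def fold_change_z(base_mean, fold):
    """Rough z-score for base_mean vs. fold * base_mean.

    Uses the fact that the variance of a difference of two independent
    Poisson counts is the sum of their means (normal approximation).
    """
    other_mean = fold * base_mean
    return (other_mean - base_mean) / sqrt(base_mean + other_mean)

for mean_count in (10, 100, 1000, 10000):
    # Same two-fold change at every expression level.
    print(mean_count, round(poisson_cv(mean_count), 3),
          round(fold_change_z(mean_count, 2), 2))
```

The z-score for an identical two-fold change grows with the base count, which is the sense in which highly expressed (or longer, hence more heavily sampled) genes are easier to call as differentially expressed.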

13.3 years ago by Sean Davis 27k

0

Highly expressed genes are also more likely to be called differentially expressed than lowly expressed genes. This effect is independent of transcript length bias, so you are still not on firm ground with SAGE or CAGE. I do not know how biased the results will be in practice, though.

13.3 years ago by Sean Davis 27k

0

Thanks a lot for the comments!

13.3 years ago by Marina Manrique ★ 1.3k

Ram · Answer 2 · 2015-04-29

Hi, I was thinking of using DAVID with RNA-seq data, BUT only to select the common DEGs that appear in different RNA-seq experiments with different samples.

I mean, I have RNA-seq data from a comparison of two samples, and another RNA-seq dataset from a comparison of two different samples. I want to know the DEGs common to both experiments, without taking their functional characteristics into account. Do you know if DAVID will solve my problem, or is there other software to do that?

I only want to know which genes are differentially expressed in all the comparisons.
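Finding the genes shared by several DEG lists is a plain set intersection and does not need an enrichment tool. A minimal sketch in Python, assuming each experiment's DEGs are already available as a list of identifiers of the same type (the gene IDs and the `common_degs` name below are hypothetical):

```python
def common_degs(*deg_lists):
    """Return the genes present in every DEG list, sorted by ID."""
    if not deg_lists:
        return []
    shared = set(deg_lists[0])
    for genes in deg_lists[1:]:
        shared &= set(genes)  # keep only genes seen in this list too
    return sorted(shared)

# Hypothetical DEG lists from two separate comparisons.
experiment_a = ["GENE1", "GENE2", "GENE3"]
experiment_b = ["GENE2", "GENE3", "GENE4"]
print(common_degs(experiment_a, experiment_b))  # ['GENE2', 'GENE3']
```

The identifiers need to match exactly across experiments (same ID system and version), otherwise shared genes will be missed.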

Ram · Answer 3 · 2015-04-29

0

10.0 years ago

andrew ▴ 560

DAVID has not updated its databases since 2009. You might consider iPathwayGuide, which is completely free to try. You can access it at www.iPathwayGuide.com

Updated 2.8 years ago by Ram 45k • written 10.0 years ago by andrew ▴ 560