RNA-Seq, why estimate the distribution of your data?
8.0 years ago
tmms ▴ 10

Hello

I am new to the field of bioinformatics and have a background in biochemistry.

When doing differential expression analysis on RNA-seq data, edgeR and DESeq2 estimate the distribution of your data. Why? They both do it, so I guess it is important, but I have no clue why they do it.

Thanks in advance.

RNA-Seq edgeR DEseq2 • 1.0k views
8.0 years ago

Typically there are relatively few replicates, so it's difficult to accurately estimate the variance (or dispersion, if you prefer that term) for each gene on its own. Consequently, both tools model the counts with a negative binomial distribution and pool information across genes to get more robust dispersion estimates. Since the reliability of a p-value is determined in large part by how accurately you've estimated the variance, this ends up being a major benefit.

Of course, if you have a lot of samples (e.g., a thousand), the per-gene estimates are reliable on their own and this pooling isn't really needed.
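
To make the point concrete, here is a minimal toy simulation in Python. It is not what edgeR or DESeq2 actually do: it assumes normally distributed log-scale values rather than negative binomial counts, and it shrinks each gene's variance halfway toward the across-gene mean using a fixed, arbitrarily chosen weight. The function name `simulate_and_compare` and all parameter values are made up for illustration. It only shows the general idea that per-gene variances from 3 replicates are noisy and that borrowing information across genes reduces the error.

```python
import random
import statistics


def simulate_and_compare(n_genes=2000, n_reps=3, weight=0.5, seed=0):
    """Return (raw_error, shrunk_error): mean absolute error of per-gene
    variance estimates before and after shrinking toward the pooled mean.

    Toy model (an assumption, not the edgeR/DESeq2 model): each gene has its
    own true variance, and replicates are normal draws with that variance.
    """
    rng = random.Random(seed)
    true_vars = []
    raw_vars = []
    for _ in range(n_genes):
        # True variances differ moderately from gene to gene.
        true_var = 0.25 * rng.lognormvariate(0.0, 0.3)
        values = [rng.gauss(0.0, true_var ** 0.5) for _ in range(n_reps)]
        true_vars.append(true_var)
        # Per-gene sample variance: very noisy when n_reps is small.
        raw_vars.append(statistics.variance(values))

    # Pool information across genes: the mean of all per-gene estimates.
    pooled = statistics.fmean(raw_vars)

    # Shrink each gene's estimate toward the pooled value (fixed weight here;
    # real tools choose the amount of shrinkage from the data).
    shrunk_vars = [weight * v + (1 - weight) * pooled for v in raw_vars]

    def mean_abs_error(estimates):
        return statistics.fmean(abs(e - t) for e, t in zip(estimates, true_vars))

    return mean_abs_error(raw_vars), mean_abs_error(shrunk_vars)


raw_error, shrunk_error = simulate_and_compare()
print(f"3 replicates, per-gene only: {raw_error:.3f}")
print(f"3 replicates, shrunk:        {shrunk_error:.3f}")

# With many replicates the per-gene estimate is already accurate.
many_raw_error, _ = simulate_and_compare(n_genes=500, n_reps=50)
print(f"50 replicates, per-gene only: {many_raw_error:.3f}")
```

With few replicates the shrunk estimates should land closer to the true variances than the per-gene ones, and with many replicates the per-gene error drops by itself, which is why pooling matters less for large studies.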
