Question

RNA-seq normalization and differential expression

0

9.6 years ago

zizigolu ★ 4.3k

Sorry friends,

I got totally confused. As I understand it, in a typical RNA-seq analysis, for example with TopHat, after producing accepted_hits.bam the reads are counted with featureCounts or another tool to produce a read count matrix. After that, is normalization (for example, log transformation) a separate second step, or is producing RPKM or VST values with a tool such as DESeq2 already normalization and differential expression analysis together? In other words, is normalization an independent step before differential expression analysis?

Thank you

RNA-Seq • 4.7k views

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 9.6 years ago by zizigolu ★ 4.3k

2

9.6 years ago

Carlo Yague 9.0k

Hi!

Most tools for differential expression analysis handle normalization themselves. In the case of DESeq2, the documentation states that only raw read counts must be used:

"The count values must be raw counts of sequencing reads. This is important for DESeq2's statistical model to hold, as only the actual counts allow assessing the measurement precision correctly. Hence, please do not supply other quantities, such as (rounded) normalized counts, or counts of covered base pairs; this will only lead to nonsensical results."

PS: be aware that DESeq2's differential expression results are not RPKM values. RPKM values shouldn't be used for differential expression analysis anyway.
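As a rough illustration of the "raw counts only" requirement, here is a minimal Python sketch of a sanity check you could run on a count matrix before handing it to a count-based tool. The helper name `looks_like_raw_counts` and the example matrices are hypothetical and not part of DESeq2; the check only tests for non-negative whole numbers, so it will not catch every kind of pre-normalized input (rounded normalized counts would pass, for instance).

```python
def looks_like_raw_counts(matrix):
    """Return True if every entry is a non-negative whole number.

    `matrix` is a list of rows (genes), each a list of per-sample values.
    Hypothetical sanity check for illustration; not part of DESeq2.
    """
    for row in matrix:
        for value in row:
            # Negative, fractional, NaN and infinite values all fail here.
            if value < 0 or not float(value).is_integer():
                return False
    return True


# Made-up example data: two genes, three samples.
raw = [[10, 0, 250], [3, 7, 12]]
logged = [[3.46, 0.0, 7.97], [2.0, 3.0, 3.7]]  # log-transformed values

raw_ok = looks_like_raw_counts(raw)
logged_ok = looks_like_raw_counts(logged)
```

A log-transformed or RPKM-scaled matrix will usually contain fractional values and so fail this check, which is a cheap hint that it is not the input DESeq2 expects.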

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 9.6 years ago by Carlo Yague 9.0k

0

thank you so much

ADD REPLY • link 9.6 years ago by zizigolu ★ 4.3k

1

9.6 years ago

GouthamAtla 12k

Differential expression analysis can refer to the entire pipeline; it's a general term. Normalisation is performed before testing for differentially expressed genes.

ADD COMMENT • link updated 5.4 years ago by Ram 45k • written 9.6 years ago by GouthamAtla 12k

0

Thank you. To check my understanding: should the counts be log-transformed before using DESeq2?

ADD REPLY • link updated 2.6 years ago by Ram 45k • written 9.6 years ago by zizigolu ★ 4.3k

1

If you mean normalization by log transformation: no. DESeq2 expects raw counts and won't work correctly with normalized or transformed values.

ADD REPLY • link 9.6 years ago by cpad0112 21k

1

9.6 years ago

Damian Kao 16k

DESeq and edgeR do not perform differential expression testing directly on normalized values. Instead, they calculate a library size factor for each sample and use that factor in their downstream differential expression tests.
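To make the idea of a per-sample size factor concrete, here is a minimal Python sketch of the median-of-ratios approach described for DESeq. It is an illustration under simplifying assumptions, not the packages' actual code: the function name `size_factors` and the toy counts are made up, genes with a zero count in any sample are simply skipped, and edgeR's default (TMM) is a different method.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    `counts` is a list of genes, each a list of raw counts per sample.
    Genes with a zero in any sample are skipped, because their
    geometric mean across samples would be zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c <= 0 for c in gene):
            continue
        # Log of the gene's geometric mean across samples.
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The size factor is the median ratio to the geometric-mean reference.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data: sample 2 was sequenced about twice as deeply as sample 1.
counts = [
    [100, 200],
    [50, 100],
    [10, 20],
    [0, 5],  # skipped: contains a zero
]
factors = size_factors(counts)
```

In this toy case the second factor is twice the first, reflecting the doubled depth. The point of the answer above is that such factors enter the statistical model as scaling terms; the count matrix itself stays raw.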

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 9.6 years ago by Damian Kao 16k

score 2 · Accepted Answer · 2015-10-03

2

9.6 years ago

jnf3769 ▴ 40

I'd say normalization is part of differential expression analysis. Depending on the biological context of your data and how you performed the sequencing, you may need to do additional normalization or quality control. Arguably, quality control could be considered part of the analysis too, or treated as a preprocessing step.

ADD COMMENT • link 9.6 years ago by jnf3769 ▴ 40

1

You mentioned quality control as a preprocessing step. In my work, initial quality control with box plots and MA plots showed that the data are not normalized. In that case, could you please tell me whether I have to normalize before running differential expression analysis with, say, edgeR, or whether normalization should only be applied afterwards, to the differentially expressed genes, in order to compare samples?

ADD REPLY • link 9.6 years ago by seta ★ 1.9k