Log2(x + 1)-transformed gene expression data are not normally distributed
6.2 years ago
rin ▴ 40

Hi all!

I am using raw count data from TCGA. Because I want to compute a Z-score between tumor and normal samples, I first have to make sure that my data are normally distributed. So far, I have downloaded the raw counts, normalized them for GC content using the TCGAanalyze_Normalization() function from TCGAbiolinks, and log2(x+1) transformed them, but the distribution is right-skewed and definitely not normal, as seen in qqnorm() plots.

How could I tackle that? I have been trying to figure it out for days, but I cannot find a solution.

Thanks a lot, R.

transformation RNA-Seq Z-score • 6.3k views

Could you reattach the link to your plot, please?

Edited! Sorry about that! :)

6.2 years ago
Benn 8.3k

Some data cannot be transformed into a normal distribution. RNA-seq count data fit a Poisson or a negative binomial distribution. There is a great answer here about how RNA-seq data are distributed.
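
To illustrate the point, here is a minimal sketch in base R using simulated data. The negative binomial parameters (`mu = 5`, `size = 0.5`) and the sample size are arbitrary assumptions chosen to mimic one low-expressed gene; they are not taken from TCGA. The idea is only to show that log2(x+1) does not turn such counts into something normal, mostly because the zeros pile up at log2(0 + 1) = 0.

```r
set.seed(1)

# Simulated counts for one hypothetical low-expressed gene across 500 samples
# (mu and size are illustrative assumptions, not estimates from real data)
counts <- rnbinom(n = 500, mu = 5, size = 0.5)
log_counts <- log2(counts + 1)

# Fraction of exact zeros; all of them map to 0 after the transformation
zero_fraction <- mean(counts == 0)
print(zero_fraction)

# Shapiro-Wilk normality test on the transformed values
sw <- shapiro.test(log_counts)
print(sw$p.value)

# Same diagnostic as in the question: Q-Q plot against the normal distribution
qqnorm(log_counts)
qqline(log_counts)
```

With real data the picture will differ from gene to gene, so treat this as a toy example rather than a description of any particular dataset.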

RNA-Seq is typically fitted to a Poisson or NB-distribution. Claiming that it fits those distributions is a bit strong though.

6.2 years ago

This is expected: RNA-seq data should be right-skewed or multimodal.

@Devon Ryan @b.nota @russhh Really helpful link and answers! Thank you! The reason I want them to be normally distributed is to assess the change between tumor and normal expression by computing a Z-score. Would that be possible / have the same interpretation if they fit a Poisson or NB distribution?

Try to use limma or edgeR for this kind of analysis.
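
As a rough sketch of what that could look like, here is one common limma-voom workflow run on a simulated count matrix. The matrix, the sample names, and the two-group design (6 normal, 6 tumor) are hypothetical stand-ins for real TCGA data, and this is only one of several reasonable ways to set it up. The moderated t-statistic in the output plays a role similar to the Z-score you are after, without requiring the counts themselves to be normally distributed.

```r
library(edgeR)  # DGEList(), filterByExpr(), calcNormFactors()
library(limma)  # voom(), lmFit(), eBayes(), topTable()

set.seed(1)

# Hypothetical stand-in for a raw count matrix: 1000 genes x 12 samples
counts <- matrix(
  rnbinom(1000 * 12, mu = 50, size = 2),
  nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12))
)
group <- factor(rep(c("normal", "tumor"), each = 6),
                levels = c("normal", "tumor"))

# Build the count object, drop very low-expressed genes, compute TMM factors
dge <- DGEList(counts = counts, group = group)
keep <- filterByExpr(dge, group = group)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# voom models the mean-variance trend, then a linear model is fitted per gene
design <- model.matrix(~ group)
v <- voom(dge, design)
fit <- eBayes(lmFit(v, design))

# Tumor vs normal: logFC, moderated t-statistic and adjusted p-value per gene
res <- topTable(fit, coef = "grouptumor", number = Inf)
head(res)
```

If your tumor and normal samples come from the same patients, the design matrix would need a patient term as well, which this sketch leaves out.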
