Strange peaks in normalized FPKM data
4 months ago
Cathy • 0

Hi!

This is my first time asking a question on this forum. My apologies for any (obvious) mistakes.

Background: Gene expression data in FPKM format, 48 hours after stimulation. Eight samples (4 HD and 4 sick). Columns are samples, rows are genes.

The FPKM data were log2-transformed, all values <1 were set to NA, and genes with too many NAs were removed as well (at least 3 non-NA values required).

After this, the values are normalized per gene using Z-score normalization, with the following steps:

SDs <- apply(x, 1, sd, na.rm = TRUE)
means <- rowMeans(x, na.rm = TRUE)
RNA_log2_FPKM_cleaned <- (x - means) / SDs

Peeking at the data via a histogram results in the following:

[Figure: T48 histogram after preprocessing, what are those weird peaks?]

These peaks are at -0.707 and 0.707. The underlying values were not all the same before the Z-score normalization (they come from different genes). Have I done something wrong? Thanks in advance for any help I can get.

gene-expression FPKM • 459 views
4 months ago
ATpoint 85k

How can a gene have NA counts in RNA-seq? It can be 0, but not NA. The 0.707 is suspicious because that's the value you get when scaling just two numbers that are different, e.g. scale(c(1, -1)). I guess you have rows with just two values not being NA. Again, where do the NAs come from?
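
To spell out where the 0.707 comes from: with only two non-NA values a and b, the mean is (a + b)/2 and the sample SD is |a - b|/sqrt(2), so the two Z-scores are always +1/sqrt(2) and -1/sqrt(2) (about ±0.707), whatever the actual values are. A quick check in base R, using made-up numbers:

```r
# Two distinct values always scale to +/- 1/sqrt(2), whatever they are.
z1 <- as.numeric(scale(c(1, -1)))
z2 <- as.numeric(scale(c(3.2, 7.9)))   # arbitrary made-up values

# A row with only two non-NA values behaves the same way,
# because scale() ignores NAs when computing the mean and SD.
z3 <- as.numeric(scale(c(NA, 5.1, NA, NA, 2.4, NA, NA, NA)))

print(round(z1, 3))
print(round(z2, 3))
print(round(z3[!is.na(z3)], 3))
```

Every gene left with exactly two values contributes one point to each of the two peaks, which would explain why they stand out in the histogram.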

By the way, the easier way of scaling a matrix is t(scale(t(x))).
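
As a sketch of why the one-liner matches the manual steps from the question: scale() centers and scales columns, so the matrix is transposed to put genes in columns, scaled, and transposed back. The toy matrix below is made up for illustration.

```r
# Toy matrix: 3 genes (rows) x 4 samples (columns), with one NA.
x <- matrix(c(2.1, 3.4, 1.8, 2.9,
              5.0, 4.2,  NA, 6.1,
              1.2, 1.9, 2.5, 3.3),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Manual row-wise Z-score, as in the question.
means <- rowMeans(x, na.rm = TRUE)
sds <- apply(x, 1, sd, na.rm = TRUE)
z_manual <- (x - means) / sds

# scale() works per column, so transpose, scale, transpose back.
z_scale <- t(scale(t(x)))

# Compare the values only; any extra attributes from scale() are ignored.
print(all.equal(z_manual, z_scale, check.attributes = FALSE))
```

Both versions skip NAs when computing the mean and SD, so they give the same Z-scores and leave the NA in place.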


I suspect the same: "I guess you have rows with just two values not being NA".


Thank you for your help and info!

The NAs are a result of filtering all values below 1 (every value <1 was set to NA). I thought the next step would remove all genes with fewer than 3 non-NA values, but I will edit it so that it actually does that.

I will also scale the matrix using the easier method. Thanks again!
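
One way the corrected filter could look, as a minimal sketch rather than the exact code used here: it assumes a log2 FPKM matrix x with genes in rows, and min_values is a hypothetical name for the minimum number of non-NA values a gene must keep (3, as mentioned above). The example values are made up.

```r
# Made-up log2 FPKM matrix: 3 genes (rows) x 8 samples (columns).
x <- rbind(
  gene1 = c(2.3, 3.1, 2.8, 4.0, 3.5, 2.9, 3.3, 3.8),
  gene2 = c(1.4, 0.2, 0.6, 0.1, 2.2, 0.3, 0.8, 0.5),  # only two values >= 1
  gene3 = c(0.9, 1.7, 2.4, 0.4, 1.9, 2.1, 0.7, 1.5)   # five values >= 1
)

# Set all values below 1 to NA, as described in the thread.
x[x < 1] <- NA

# Keep only genes with at least min_values non-NA entries (hypothetical name).
min_values <- 3
keep <- rowSums(!is.na(x)) >= min_values
x_filtered <- x[keep, , drop = FALSE]

# Row-wise Z-scores on the filtered matrix.
z <- t(scale(t(x_filtered)))
print(rownames(x_filtered))
```

Here gene2 is dropped because only two of its values survive the <1 filter, so no remaining row can collapse to ±0.707.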
