Rank Normalization of genes in Expression Data
9.3 years ago
Ron ★ 1.2k

Hi all,

I want to do rank-based normalization of RNA-seq expression data: basically, ranking the genes in a sample by FPKM and dividing each rank by the total number of genes to get a normalized expression value between 0 and 1.

This has to be done across multiple samples. Is there a method to do this?

I have used the method below, which seems to work, but I want to look at a rank-based method.

gene_min=apply(df, 1, min)
gene_max=apply(df, 1, max)
df_norm=(df-gene_min)/(gene_max-gene_min)

Thanks,
Ron

RNA-Seq R • 15k views

Hi Ron, what have you tried? A very simple approach is to apply a small filter (threshold) to the FPKM values, scale the genes between 0 and 1, and then sort; this is easy to do in R. Let us know if you get stuck and I can write a function for you.
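
A rough sketch of the filter, scale, and sort steps for a single sample might look like the following. The gene names, FPKM values, and the `min_fpkm` threshold are all made up for illustration, and min-max scaling is only one possible reading of "scale between 0 and 1".

```r
# Toy FPKM values for one sample (made-up numbers)
fpkm <- c(geneA = 0.2, geneB = 15, geneC = 3, geneD = 40)

# Hypothetical threshold; pick one that suits your data
min_fpkm <- 1

# 1. Filter: drop genes below the threshold
kept <- fpkm[fpkm >= min_fpkm]

# 2. Scale: min-max scale the remaining genes to [0, 1]
scaled <- (kept - min(kept)) / (max(kept) - min(kept))

# 3. Sort: order genes from highest to lowest scaled value
sorted <- sort(scaled, decreasing = TRUE)
print(sorted)
```

Note that the scaling step would divide by zero if all retained genes had the same FPKM, so a real script may want to check for that.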


I have added the method I am using, which is a bit different, although it scales values between 0 and 1.

9.3 years ago
Kamil ★ 2.3k

Perhaps you're trying to do something like this?

> mat
          1        2        3        4        5
2  6.890809 6.744169 6.642575 6.649212 6.756785
9  4.303356 4.250599 4.245089 4.193621 4.471561
10 3.739968 3.823797 3.942015 3.850949 3.699985
12 8.237043 8.233598 8.315632 8.354951 8.472915
13 3.051626 2.983962 2.997821 3.017578 2.966767

> apply(mat, 2, function(y) rank(y) / length(y))

     1   2   3   4   5
2  0.8 0.8 0.8 0.8 0.8
9  0.6 0.6 0.6 0.6 0.6
10 0.4 0.4 0.4 0.4 0.4
12 1.0 1.0 1.0 1.0 1.0
13 0.2 0.2 0.2 0.2 0.2
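
Real FPKM tables usually contain ties (for example, many genes at 0) and sometimes missing values, so it may help to wrap the same idea in a small function. This is only a sketch: `rank_normalize` and the toy matrix `fpkm_mat` are hypothetical names, and it assumes genes are in rows and samples in columns, as in `mat` above. Tied genes share an averaged rank, and `NA` values stay `NA` and are left out of the denominator.

```r
# Hypothetical helper: rank-normalize each sample (column) of a matrix.
# 'ties' is passed to rank()'s ties.method argument.
rank_normalize <- function(mat, ties = "average") {
  mat <- as.matrix(mat)
  out <- mat
  for (j in seq_len(ncol(mat))) {
    y <- mat[, j]
    # na.last = "keep" leaves NA values as NA instead of ranking them
    r <- rank(y, na.last = "keep", ties.method = ties)
    # divide by the number of non-missing genes in this sample
    out[, j] <- r / sum(!is.na(y))
  }
  out
}

# Toy data (made-up values): 4 genes x 2 samples, with a tie and an NA
fpkm_mat <- matrix(
  c(5, 0, 0, 12,
    3, 8, NA, 1),
  nrow = 4,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), c("s1", "s2"))
)

res <- rank_normalize(fpkm_mat)
print(res)
```

With `ties = "average"`, the two zero-valued genes in `s1` both get 0.375. Other choices such as `"min"` or `"max"` are also valid `ties.method` values, and which one is appropriate depends on the downstream analysis.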

Yes, that's what I was looking for. Works awesome!
