How to handle a log2 transformation that creates NaN values
6.6 years ago

Hi everybody, I have a problem analyzing microarray data. When I apply a log2 transformation to bring the min and max into a normal range, it gives me NaN values, so I cannot continue the analysis. The code I use is:

ex<-exprs(gset)
dim(ex)
ex<- log2(ex+1)
exprs(gset) <- ex
min(ex)
max(ex)

When I add a bigger value to ex (like ex+5), it no longer gives me NaN values. Is it wise to do that? If not, please help me get rid of the NaN values. Thanks a lot.

P.S. I am not very skilled in data analysis or R programming, so if you can give me code that solves my problem, I would really appreciate it.

R microarray limma GEO • 6.3k views

What is the range of the exprs matrix before you apply the log transformation?
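
A minimal sketch of that check, and of why NaN appears. The toy matrix is an assumption for illustration (only the two extreme values come from this thread); with real data, use ex <- exprs(gset) instead. The point is that log2(ex + 1) is NaN wherever ex is below -1, because ex + 1 is then negative.

```r
# Toy stand-in for exprs(gset); only -35.7876 and 59600.26 come from the thread,
# the other values are invented for illustration.
ex <- matrix(c(-35.7876, -0.5, 0, 12.3, 850.2, 59600.26), nrow = 3)

# Range before any transformation
ex_range <- range(ex, na.rm = TRUE)

# Number of values for which ex + 1 is negative
n_below <- sum(ex < -1, na.rm = TRUE)

# log2() of a negative number returns NaN (normally with a warning)
logged <- suppressWarnings(log2(ex + 1))
n_nan <- sum(is.nan(logged))

print(ex_range)
print(c(below_minus_one = n_below, nan_after_log2 = n_nan))
```

The two counts should agree: every NaN in the result traces back to a value below -1 in the input.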

You should not encounter this issue with microarray data because the QC and normalisation methods for microarray analyses are very 'mature' and stable. Please show your code that tells us how you created the gset object - the data in exprs(gset) may already be quantile normalised and transformed by log base 2.

If you run hist(exprs(gset)), what does the result look like? Paste the image here via imgbb.
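
Besides looking at the histogram, a rough numeric check can suggest whether a matrix is still on the raw intensity scale. The sketch below is only a rule of thumb: looks_unlogged is a hypothetical helper (not part of Biobase or GEOquery) and its thresholds are assumptions. It relies on the fact that log2 intensities from a 16-bit scanner stay below about 16, so very large values or a very wide range point to unlogged data. The simulated matrices stand in for exprs(gset).

```r
# Hypothetical helper: guesses whether an expression matrix is unlogged.
# The thresholds (100 and 50) are rule-of-thumb assumptions.
looks_unlogged <- function(ex) {
  qx <- as.numeric(quantile(ex, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))
  # Either the 99th percentile is far above any plausible log2 value,
  # or the overall range is wide while the lower quartile is positive.
  (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)
}

# Simulated data standing in for exprs(gset)
set.seed(1)
raw <- matrix(2^rnorm(1000, mean = 8, sd = 2), ncol = 10)  # raw-scale intensities
logged <- log2(raw)                                        # same data on log2 scale

print(looks_unlogged(raw))
print(looks_unlogged(logged))
```

A result of TRUE only suggests that a log2 transformation is still needed; it says nothing about whether the data have been normalised.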

Thank you!

Hi Kevin, thanks for your reply! This is my code for creating the gset object from the GSE29226 dataset.

gset <- gset[[1]]
ex <- exprs(gset)

It's not normalized yet, because it gives me min(ex) = -35.7876 and max(ex) = 59600.26.

It looks like that data may already be normalised, if obtained this way:

library(Biobase)
library(GEOquery)

# load series and platform data from GEO

gset <- getGEO("GSE29226", GSEMatrix =TRUE, getGPL=FALSE)
if (length(gset) > 1) idx <- grep("GPL6947", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

# set parameters and draw the plot

dev.new(width=4+dim(gset)[[2]]/5, height=6)
par(mar=c(2+round(max(nchar(sampleNames(gset)))/2),4,2,1))
title <- paste ("GSE29226", '/', annotation(gset), " selected samples", sep ='')
boxplot(exprs(gset), boxwex=0.7, notch=T, main=title, outline=FALSE, las=2)

# negative intensities become NaN under log() (with a warning); hist() drops non-finite values
hist(log(exprs(gset)))

[Images: boxplot and histogram produced by the code above]

Thanks a lot, Kevin. I thought I should check min(ex) and max(ex) to be sure about the log2 transformation. I didn't know that I could draw a log plot. By the way, do you know what the benefit of knowing min(ex) and max(ex) is? In my analysis they are -21.7958 and 37248.49, which seems high...

Knowing the min and max is not too relevant in this regard, in my opinion. Min and max values will differ a lot across different microarray types and versions.

I believe that such a range is typical for this array type, i.e., the Illumina HT Expression BeadChip. For your study of interest, the true raw data is accessible via a link at the bottom of the GEO Accession record:

[Image: screenshot from the GEO Accession record]

[source: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE29226]

The data that you obtain via my code is the unlogged, normalised counts, from what I understand. You should confirm this with the authors of the study.
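
If the data are indeed unlogged, the negative values have to be dealt with before log2 is applied. The sketch below shows two common options on a toy matrix (the values are invented apart from the two extremes reported in this thread; the floor of 1 is an arbitrary assumption). It is not a recommendation for this particular dataset. Adding a constant such as ex + 5 only removes NaN if the constant exceeds the absolute value of the minimum, and it also shrinks differences between low intensities.

```r
# Toy stand-in for exprs(gset); only -35.7876 and 59600.26 come from the thread.
ex <- matrix(c(-35.7876, -0.5, 0, 12.3, 850.2, 59600.26), nrow = 3)

# Option 1: treat non-positive intensities as missing, then take log2
ex_na <- ex
ex_na[ex_na <= 0] <- NA
log_na <- log2(ex_na)

# Option 2: floor the intensities at a small positive value before log2
# (the floor of 1 is an assumption; it maps every floored value to 0)
log_floor <- log2(pmax(ex, 1))

print(log_na)
print(log_floor)
```

Option 1 leaves NA values that downstream tools need to tolerate; option 2 keeps the matrix complete but collapses everything below the floor to the same value.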

I didn't pay attention to the supplementary data. Thanks a lot again, dear Kevin, for your help.

Hi dear rush. This is the information you wanted for GSE29226.

dim(ex)

[1] 48803 24

min(ex)

[1] -35.7876

max(ex)

[1] 59600.26
