Hi,
I have a question about normalizing gene RPKM values with ERCC spike-in controls. I found a paper that uses loess.normalize() for this, but I am not sure how to set its parameters. When I pass the RPKM of all genes as 'mat' and the RPKM of the 92 ERCC spike-in controls as 'subset', I get this error:
>loess.normalize(mat,subset)
Error in x[index] : invalid subscript type 'list'
> head(mat)
EL11 EL12 EL14 EL16 EL17 EL18
A1BG 0.1163737 0.1097968 0.00000000 0.000000 0.00000000 0.000000
A1BG-AS1 0.1447295 0.3641338 0.08168055 0.000000 0.26798862 0.000000
A1CF 3.2351127 1.1802152 1.97185377 3.495367 5.63088272 39.181522
A2M 63.2050908 56.7993487 48.27157466 53.446147 77.22512806 81.373673
A2M-AS1 0.7987587 0.8866083 0.95462046 1.570776 2.08803204 2.410139
A2ML1 0.0000000 0.0000000 0.00000000 0.000000 0.07982693 0.000000
> head(subset)
EL11 EL12 EL14 EL16 EL17 EL18
ERCC-00002 185.8553527 166.30548 112.652188 260.00801 391.25896 125.57826
ERCC-00003 8.5380516 11.56205 9.013605 12.70220 23.29576 22.89986
ERCC-00004 119.6550648 145.33323 115.764615 91.67239 296.32214 225.08306
ERCC-00009 23.2876276 27.39052 21.217022 31.72389 45.10257 33.33047
ERCC-00012 0.0000000 0.00000 0.000000 0.00000 0.00000 0.00000
ERCC-00013 0.2543514 0.00000 0.000000 0.00000 0.00000 0.00000
Can anyone help with this?
Thanks. Cam
Thanks a lot dpryan. After doing that, I get another error:
> loess.normalize(mat,subset=which(row.names(subset) %in% row.names(mat)))
Error in simpleLoess(y, x, w, span, degree, parametric, drop.square, normalize, : NA/NaN/Inf in foreign function call (arg 1)
I really have no idea what this error means. Can you give me some guidance on it, or suggest another way to normalize by spike-in controls?
What's the output of which(row.names(subset) %in% row.names(mat))?
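Judging from the first error (`x[index]` with a list subscript), 'subset' seems to be used to index rows of 'mat', so it should be a plain vector of row positions rather than a data frame. Here is a minimal sketch of how that check could look, on made-up toy data. The names `genes`, `spikes`, `all_rpkm` and `ercc_idx` are hypothetical, and I am assuming the spike-in rows have to be part of the matrix being normalized. Note that the order of the operands matters: `which(rownames(a) %in% rownames(b))` returns positions within `a`.

```r
# Toy data standing in for the poster's matrices (values are made up).
genes <- matrix(c(0.12, 3.2, 63.2, 0.11, 1.2, 56.8), ncol = 2,
                dimnames = list(c("A1BG", "A1CF", "A2M"), c("EL11", "EL12")))
spikes <- matrix(c(185.9, 8.5, 166.3, 11.6), ncol = 2,
                 dimnames = list(c("ERCC-00002", "ERCC-00003"), c("EL11", "EL12")))

# Assumption: the spike-ins must be rows of the matrix that gets normalized,
# so append them if they are not already there.
all_rpkm <- rbind(genes, spikes)

# Integer row positions of the spike-ins within all_rpkm (not within spikes).
ercc_idx <- which(rownames(all_rpkm) %in% rownames(spikes))
print(ercc_idx)
```

If `ercc_idx` comes back empty, the spike-in names are not present in the matrix, and passing it as 'subset' would not select any control rows.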
I get the same error as the one above.
I found an explanation of this online: it can occur if your background-subtracted values are <= 0 in one of the channels, leading to a nonsensical log(ratio) value.
But I have no idea how to fix it.
I've seen some papers that arbitrarily add a very small number to all RPKM values prior to log2 transformation, e.g. log2(x + 0.01). Alternatively, you could apply your loess normalization to untransformed RPKM values and just filter out genes/transcripts with RPKM below a minimum cutoff.
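Both options can be sketched in base R on a toy matrix. This is only an illustration with made-up values: the names `rpkm`, `pseudocount`, `min_rpkm`, `keep` and `filtered` are hypothetical, and the pseudocount of 0.01 and the cutoff of 1 are arbitrary choices, not recommendations.

```r
# Toy RPKM matrix containing zeros (values are made up).
rpkm <- matrix(c(0, 8.5, 119.7, 0.25, 0, 11.6, 145.3, 0), ncol = 2,
               dimnames = list(c("ERCC-00012", "ERCC-00003",
                                 "ERCC-00004", "ERCC-00013"),
                               c("EL11", "EL12")))

# log2(0) is -Inf, which is the kind of value loess rejects.
has_nonfinite <- any(!is.finite(log2(rpkm)))

# Option 1: add a small pseudocount before the log transform.
pseudocount <- 0.01                      # arbitrary, as in log2(x + 0.01)
log_rpkm <- log2(rpkm + pseudocount)

# Option 2: keep only rows whose RPKM reaches a minimum cutoff in every sample.
min_rpkm <- 1                            # hypothetical cutoff
keep <- apply(rpkm, 1, min) >= min_rpkm
filtered <- rpkm[keep, , drop = FALSE]

print(has_nonfinite)
print(all(is.finite(log_rpkm)))
print(rownames(filtered))
```

If the values are log-transformed by hand as in option 1, any log step inside the normalization function should probably be switched off so the data is not logged twice. I believe affy's `normalize.loess` has a `log.it` argument for this, but check the help page for the function you are actually calling.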