5.5 years ago
Laura_zz
I obtained RMA-processed data from ArrayExpress; it is single-channel data. I don't know much about gene chips, so I read up on them and tried to run the script provided with the limma package myself, but I don't know whether it is right.
- One step in the script is to run exprs() on the RMA data, but I can't run it, so I used log2(RMA data + 1) in place of exprs(). Is that correct?
- Should the RMA data first be transformed with log2(RMA data + 1) and then passed to lmFit in the limma package for the differential expression calculation?
Your data is likely already normalised and log2-transformed. Can you please show all of the data processing commands that you have used?
exprs() has nothing to do with the RMA process itself. exprs() is just a function that accesses a 'slot' (variable) in an ExpressionSet R object - this 'slot' contains the expression data.
RMA normalisation involves (in this order): background correction, quantile normalisation, and log (base 2) transformation.
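To make the point about exprs() concrete, here is a minimal sketch. It assumes the Bioconductor package Biobase is installed; the toy matrix and its probe and sample names are made up for illustration and are not from this dataset.

```r
# Assumes the Bioconductor package Biobase is installed.
library(Biobase)

# Hypothetical toy matrix: 4 probes x 3 samples, values on the log2 scale.
mat <- matrix(c( 7.1,  8.3,  6.9,
                10.2, 10.4, 10.1,
                 5.5,  5.7,  9.8,
                12.0, 11.8, 12.1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("probe", 1:4), paste0("sample", 1:3)))

# Wrap the matrix in an ExpressionSet, the container that exprs() reads from.
eset <- ExpressionSet(assayData = mat)

# exprs() only retrieves the stored matrix; it does not transform anything.
same_values <- isTRUE(all.equal(exprs(eset), mat))

# log2() changes the numbers, so it is not a substitute for exprs().
changed_by_log2 <- !isTRUE(all.equal(log2(mat), mat))
```

If what you have is already a plain numeric matrix or data frame rather than an ExpressionSet, there is nothing for exprs() to extract, which may be why the call failed.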
In these commands, I used log2() in place of exprs(). The RMA data came from ArrayExpress; if I use this data directly, the fold changes in differential expression are in the thousands to tens of thousands, so I'm confused about whether the data can be used for differential expression analysis directly.
The protocol description of this experiment states that the raw data (.probe file) was subjected to RMA (Robust Multi-Array Analysis; Irizarry et al. Biostatistics 4(2):249), quantile normalization (Bolstad et al. Bioinformatics 19(2):185), and background correction as implemented in the NimbleScan software package, version 2.4.27 (Roche NimbleGen, Inc.).
Experiment description link: https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-38130/?keywords=&organism=Oryza+sativa&exptype%5B%5D=%22rna+assay%22&exptype%5B%5D=&array=&page=1&pagesize=500&tdsourcetag=s_pctim_aiomsg
Hey, I am not sure what you mean by this ^
log2() and exprs() do different things - one cannot replace the other. See the description in the manual page of exprs() (?exprs in R).
It would be easier to use GEO2R to obtain this data - the EBI has no great automated way to obtain published expression datasets - NCBI's GEO does.
On the GEO record for this dataset, click ANALYZE WITH GEO2R and then open the R script tab. There, you will find code to automatically obtain the data.
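As a rough idea of what such a download script does, here is a sketch. It assumes the Bioconductor package GEOquery is installed and that the GEO accession is GSE38130 (inferred from the ArrayExpress identifier E-GEOD-38130). The wrapper function fetch_geo_expression is a hypothetical helper for illustration, not part of GEO2R or GEOquery.

```r
# Sketch only: assumes GEOquery is installed and that the GEO accession is
# GSE38130 (inferred from E-GEOD-38130). fetch_geo_expression is hypothetical.
fetch_geo_expression <- function(accession = "GSE38130") {
  if (!requireNamespace("GEOquery", quietly = TRUE)) {
    stop("Please install the Bioconductor package 'GEOquery' first.")
  }
  # getGEO() returns a list of ExpressionSet objects, one per platform.
  gset <- GEOquery::getGEO(accession, GSEMatrix = TRUE, getGPL = FALSE)
  # Assumption: the series uses a single platform, so take the first element.
  eset <- gset[[1]]
  # exprs() extracts the probes x samples expression matrix.
  Biobase::exprs(eset)
}

# Usage (needs an internet connection):
# ex <- fetch_geo_expression()
# dim(ex)
```

If the series turns out to span more than one platform, the first list element may not be the one you want, so check names(gset) before indexing.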
It seems that, for this project, the data is not log-transformed, so you need to log2-transform it before running limma.
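One way to check whether a matrix still needs transforming is to look at its quantiles, along the lines of the check that GEO2R-style scripts use. The sketch below runs on simulated values, not on this dataset; the matrix ex and the thresholds are assumptions for illustration.

```r
# Hypothetical expression matrix on the raw (non-log) intensity scale.
set.seed(1)
ex <- matrix(2^rnorm(60, mean = 9, sd = 2), nrow = 20,
             dimnames = list(paste0("probe", 1:20), paste0("sample", 1:3)))

# Heuristic: log2-scale microarray values usually stay below roughly 16 to 20,
# so a very large upper quantile or a very wide range suggests raw intensities.
qx <- as.numeric(quantile(ex, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))
needs_log <- (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)

if (needs_log) {
  ex[ex <= 0] <- NaN   # log2 is undefined for non-positive values
  ex <- log2(ex)
}
```

Note that this uses plain log2() rather than log2(x + 1). The offset is mainly useful when zeros are present, which is less of a concern for background-corrected intensities that are well above zero.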
Technically speaking, the authors' description of their data processing steps is incorrect. RMA normalisation IS background correction, quantile normalisation, and log (base 2) transformation. So, technically, they have not performed RMA (they only performed 2 of its steps).
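For intuition only, here is a toy base-R sketch of the last two steps, quantile normalisation followed by log2 transformation. Background correction is deliberately left out, the intensities are made up, and quantile_normalise is a hypothetical helper. For real data, use an established implementation (for example, normalizeBetweenArrays() in limma with method = "quantile") rather than this.

```r
# Hypothetical raw intensities: 5 probes x 3 samples (no missing values).
raw <- matrix(c(  120,   300,   90,
                 4000,  5200, 3100,
                   35,    60,   20,
                  800,   950,  700,
                15000, 21000, 9000),
              nrow = 5, byrow = TRUE,
              dimnames = list(paste0("probe", 1:5), paste0("sample", 1:3)))

# Toy quantile normalisation: force every sample to share one distribution.
quantile_normalise <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank within each sample
  sorted <- apply(x, 2, sort)                         # sort each sample
  target <- rowMeans(sorted)                          # mean of each quantile
  out    <- apply(ranks, 2, function(r) target[r])    # map ranks to target
  dimnames(out) <- dimnames(x)
  out
}

qn      <- quantile_normalise(raw)  # step 2 of the three steps listed above
log2_qn <- log2(qn)                 # step 3: values now on the log2 scale
```

After the log2 step, a difference of 1 between two group means corresponds to a 2-fold change, which is the scale limma expects.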