GEO .cel file and the preview table: are they the same thing?
6.1 years ago
Boboboe ▴ 40

Dear all,

I'm trying to use the data linked below for differential expression analysis. I would like to use the processed data, which seems to be the .cel file listed under the supplementary files. In addition, the page also shows a data table: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM496105

I have a few questions about this:

1. Is the data table a preview of what is in the .cel file?
2. According to the data processing description, the content of the .cel file seems to be already normalized. Once I have the .cel file, can I go straight to the analysis? If so, I can't quite figure out how to do that.

I used the affy package in R:

library(affy)
a <- justRMA(filenames = "ctrl_1.CEL", normalize = FALSE, background = FALSE)
write.exprs(a, file = "evals.txt")

The head of the output is shown in the first picture at https://imgur.com/4v8FSbH; the second picture shows the table from the GEO page.

Please help me understand what is going on and what to do. Thank you all in advance!

RNA-Seq microarray GEO • 1.6k views
6.1 years ago
Ahill ★ 2.0k

The .cel files hold nearly raw intensity data: they contain the probe-level intensities from the array, not values summarized per probeset. So (for example) if there were 11 probe-pairs per probeset on the array, the .cel file would contain 22 intensities per probeset.
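To make the probe-level versus probeset-level distinction concrete, here is a minimal toy sketch in base R. It uses simulated intensities (not data from GSM496105), and the probe and array counts are hypothetical. It shows only the median-polish summarization step that RMA-style methods use to collapse many probe intensities into one log2 value per probeset per array; real pipelines also apply background correction and normalization first.

```r
# Toy illustration only: simulated intensities, not data from GSM496105.
set.seed(1)
n_probes <- 11  # perfect-match (PM) probes in one hypothetical probeset
n_arrays <- 3   # hypothetical number of .cel files

# Simulated PM intensities: rows are probes, columns are arrays.
pm <- matrix(
  rlnorm(n_probes * n_arrays, meanlog = 7, sdlog = 1),
  nrow = n_probes,
  dimnames = list(paste0("probe", seq_len(n_probes)),
                  paste0("array", seq_len(n_arrays)))
)

# Median polish on the log2 scale separates probe effects from array effects.
fit <- medpolish(log2(pm), trace.iter = FALSE)

# One summarized log2 value per array for this probeset.
probeset_expr <- fit$overall + fit$col
print(probeset_expr)
```

The 11 x 3 matrix of probe-level values is reduced to 3 numbers, which is the kind of per-probeset value a processed data table reports.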

According to the GEO page you linked, the data table at GEO shows GC-RMA signal intensities derived from the .cel file by the original authors, one summarized log2-scale intensity per probeset. This is the 'processed data' and it should be suitable for differential analysis.

The output you generated comes from justRMA(), which gives one log2-scale summarized value per probeset. Note that because you passed normalize = FALSE and background = FALSE, it skips RMA's background correction and normalization steps. Either way, RMA is a different method from the GC-RMA that the original authors used to generate the values in the GEO table, so you'll get different numbers. For more details on the differences between RMA and GC-RMA you can check the corresponding publications.
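If you want to see the difference for yourself, a rough sketch like the following puts the two summaries side by side. It assumes the Bioconductor packages affy, gcrma and Biobase are installed, along with the CDF and probe annotation packages for the array type, and that the file from the question, ctrl_1.CEL, is in the working directory. The helper name compare_rma_gcrma is made up for illustration. Even the GC-RMA column may not match the GEO table exactly, since the authors presumably processed their arrays together and may have used different settings.

```r
# Sketch only; assumes 'affy', 'gcrma' and 'Biobase' (Bioconductor) are installed.
# compare_rma_gcrma is a hypothetical helper, not part of any package.
compare_rma_gcrma <- function(cel_file) {
  rma_eset   <- affy::justRMA(filenames = cel_file)     # RMA with default settings
  gcrma_eset <- gcrma::justGCRMA(filenames = cel_file)  # GC-RMA with default settings

  # Extract the single array's values as named vectors (names are probeset IDs).
  rma_vals   <- Biobase::exprs(rma_eset)[, 1]
  gcrma_vals <- Biobase::exprs(gcrma_eset)[, 1]

  # Align on shared probeset IDs and return both summaries side by side.
  ids <- intersect(names(rma_vals), names(gcrma_vals))
  data.frame(probeset = ids,
             rma      = rma_vals[ids],
             gcrma    = gcrma_vals[ids],
             row.names = NULL)
}

# Only runs if the .cel file from the question is actually present.
if (file.exists("ctrl_1.CEL")) {
  print(head(compare_rma_gcrma("ctrl_1.CEL")))
}
```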

If you are comfortable with using GC-RMA as submitted by the study authors, then for gene-level differential analysis, you don't need to retrieve the .cel files and reprocess them using justRMA(). Instead, you could download the GC-RMA tables provided by GEO and analyze those GC-RMA values directly. Unless you are doing something unusual, or have a good reason to do otherwise, that's probably the simplest and best approach for your differential analysis.
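One possible way to pull that table into R without going through the browser is the GEOquery package. The sketch below assumes GEOquery is installed from Bioconductor and that GEO is reachable over the network; fetch_gsm_table and the output file name are hypothetical names chosen for illustration.

```r
# Sketch only; assumes the Bioconductor package 'GEOquery' is installed
# and that GEO is reachable over the network.
# fetch_gsm_table is a hypothetical helper, not part of GEOquery.
fetch_gsm_table <- function(accession, out_file = NULL) {
  gsm <- GEOquery::getGEO(accession)  # for a GSM accession this returns a GSM object
  tab <- GEOquery::Table(gsm)         # the sample's data table as a data.frame

  # Optionally save the table as a tab-delimited text file.
  if (!is.null(out_file)) {
    write.table(tab, file = out_file, sep = "\t",
                quote = FALSE, row.names = FALSE)
  }
  tab
}

# Example call (commented out because it needs network access):
# tab <- fetch_gsm_table("GSM496105", out_file = "GSM496105_table.txt")
# head(tab)
```

This retrieves a single sample. For a differential analysis you would need the tables for all samples in the study, for example by repeating the call for each sample accession.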


Thank you so much for your quick response! I just can't figure out how to download that table from the page :( "View full table" only shows the table on another page in HTML format, and when I try to download it using "Save as", it is still saved as HTML. How do I download the table?

Edit: actually, I figured it out. Thank you so much for your help!
