Affy Readbatch And "Does Not Seem To Have The Correct Dimensions"
13.3 years ago
vodka ▴ 80

Hi,

I'm trying to normalize some data downloaded from GEO with the R affy package, but ReadAffy rejects some of the CEL files:

"Cel file cels/GSM362260.CEL.gz does not seem to have the correct dimensions"

while others are OK. I googled around and saw that the cause could be truncated files (but read.table on those files does not complain) or CEL files that use different CDFs, but that does not seem to be the case here. What could the problem be? Also, is it possible to exclude these files automatically, e.g. by trying to add them one by one to an AffyBatch object? (Sorry for the lame question, just point me to the right documentation; I'm new to affy and even to R.)

r microarray affymetrix • 7.5k views
13.3 years ago

It looks like the superseries to which this CEL file belongs on GEO (see here) contains both mouse and rat arrays. Is it possible you're intermingling them? Affy would probably peg the CDF geometry off of the first CEL it happens to read.

If that's not it, look at the first 20 lines of a CEL file that does load (you can just use more on the file to see the header) and compare them with the same lines of the problematic CEL file. The two headers should look similar, though not identical.

If you're using ReadAffy to load your CEL files, you can pass it a list of CEL files to read rather than letting it pick up every file in the working directory. Alternatively, you can rename a file to "foo.CEL.doesNOTwork" so it is skipped, but it would be better to figure out what the real problem is.
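The header comparison can also be done from within R. This is a minimal sketch, assuming the affyio package (which affy uses to read CEL files) is installed. `cel_chip_types` is a hypothetical helper name, and its `read_header` argument exists only so the function can be tried without real CEL files.

```r
# Return the chip type (cdfName) recorded in the header of each CEL file.
# `read_header` defaults to affyio::read.celfile.header; it is a parameter
# only so the helper can be exercised with a stand-in function.
cel_chip_types <- function(cel_files,
                           read_header = affyio::read.celfile.header) {
  vapply(cel_files,
         function(f) read_header(f)$cdfName,
         character(1))
}
```

For example, `table(cel_chip_types(list.files("cels", full.names = TRUE)))` (with "cels" standing in for your own directory) should show a single chip type if the files are homogeneous. More than one row means platforms are mixed, and only files of one type should go into the `filenames` argument of a given ReadAffy call.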

13.3 years ago
vodka ▴ 80

Yup, thanks (even if I prefer less :) ). I thought they were all mouse data, and I'm already passing a list of CEL files to ReadAffy rather than a whole directory. What I would like is something like a try/catch around the loading of each CEL file, to solve this problem once and for all: the R script that does the normalization should be able to detect the "wrong" CEL files, print a warning and carry on with the good ones, without manual intervention. I will read a bit more of the affy documentation to see if that's possible. Let's see these headers...

...and here it is, my fault (what a surprise...): it appears that the CELXXX_RAW.tar for GSE14499 also contains rat data, even though those samples are not in the file list and belong to a different GSE, and I didn't double-check its contents. Thank you for your help, next time I won't trust a .tar from a single GSE :)


It might not be straightforward to write code that detects the "wrong" files when there are no "wrong" files, only a mix of different CEL types (your "wrong" ones might be someone else's "right" ones). Explicit is better than implicit: the function expects a homogeneous set of CEL files and checks for this.
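One explicit alternative is to split the files by chip type yourself and load each homogeneous group separately. The sketch below assumes you already have a character vector `chip_type` parallel to `cel_files` (for instance the cdfName read from each CEL header). `read_affy_by_chip` is a hypothetical helper name, and `read_batch` is a parameter only so the logic can be tried without real data. By default it calls affy::ReadAffy.

```r
# Load each homogeneous group of CEL files into its own object.
# A group that fails to load produces a warning and is skipped.
read_affy_by_chip <- function(cel_files, chip_type,
                              read_batch = function(f) affy::ReadAffy(filenames = f)) {
  # One character vector of file names per chip type
  groups <- split(cel_files, chip_type)

  batches <- lapply(names(groups), function(chip) {
    tryCatch(
      read_batch(groups[[chip]]),
      error = function(e) {
        warning("Skipping ", chip, ": ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })
  names(batches) <- names(groups)

  # Keep only the groups that loaded successfully
  Filter(Negate(is.null), batches)
}
```

The result is a named list with one entry per chip type, so nothing is silently declared "wrong": you pick the platform you want by name and normalize that AffyBatch on its own.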
