Hello:
I am processing raw data from CMAP, which comes as a large number of files in the .CEL format. I know I can read the values with R, but I want to process the .CEL data in Python. I have learned that the Biopython package can read CEL files. Does anyone know in detail how to process .CEL data with Python? Below is the code I am using, but something is wrong with it.
from Bio.Affy import CelFile
with open('AGENT_p_NCLE_RNA6_HG-U133_Plus_2_A01_436578.CEL') as handle:
    c = CelFile.read(handle)
print(c)
print(c.ncols, c.nrows)
The result is as follows:
<Bio.Affy.CelFile.Record object at 0x02534730>
(None, None)
What is wrong with my code? Also, R uses the CDF when reading these files, but the Python code does not. Why?
I would appreciate any help with this problem.
Your code looks fine, but are you sure the file exists in the same directory and is not empty?
I am sure it is in the same directory. The .CEL file has contents and can be read by R.
Can you try the same code on the CEL file given in the BioPython repo?
https://github.com/biopython/biopython/tree/master/Tests/Affy
Here is the download link: affy_v3_example.CEL
I downloaded affy_v3_example.CEL and ran the same code on it.
The result is fine, so what is the problem with my data?
I also opened affy_v3_example.CEL in an editor, and it looks like processed data to me, because I expected a CEL file to hold raw data about the probe sets. My own CEL file shows up as garbled characters when I open it.
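Garbled characters in an editor usually mean the file is binary rather than text. As a rough sketch (not an official Biopython recipe), the first few bytes can be checked to guess which CEL flavour a file is. The function names here are hypothetical, and the magic values are assumptions based on my understanding of the Affymetrix format descriptions: version 3 is plain text starting with "[CEL]", version 4 is binary starting with the little-endian integers 64 and 4, and Command Console files start with the bytes 59 and 1.

```python
import struct


def classify_header(head):
    """Guess the CEL flavour from the first 8 bytes of a file (heuristic)."""
    # Version 3 is a plain-text, INI-like file whose first section is [CEL].
    if head.lstrip().startswith(b"[CEL]"):
        return "v3-text"
    # Version 4 is binary: two little-endian int32 values, magic 64 then version 4.
    if len(head) >= 8:
        magic, version = struct.unpack("<ii", head[:8])
        if magic == 64 and version == 4:
            return "v4-binary"
    # Command Console ("Calvin") files: magic byte 59 followed by version byte 1.
    if head[:2] == b"\x3b\x01":
        return "command-console"
    return "unknown"


def sniff_cel_format(path):
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(8))
```

Calling sniff_cel_format on your own file and on affy_v3_example.CEL should show whether the two are in different formats. If yours is reported as binary, opening it in text mode with a parser that expects version 3 would explain the empty record. As far as I know, newer Biopython releases document reading version 4 files when the handle is opened with 'rb', but please check the documentation for the version you have installed.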
I am also confused about the approach Python uses. A .CEL file contains a lot of probes, so I would expect it to need a CDF, and R handles this correctly. In Python, however, the CDF does not seem to matter. How does that work?
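On the CDF question, my understanding is that a CEL file only stores one intensity per cell of the chip grid, so it can be read without a CDF. The CDF is what says which cells belong to which probe set, and it is only needed for the later step of summarising probes into probe-set values. Here is a toy sketch of that step. The grid, the layout dictionary, and the summarize function are all made up for illustration, and the plain median stands in for what real pipelines such as RMA do after background correction and normalisation.

```python
from statistics import median

# Toy 3x3 intensity grid, standing in for the per-cell values read from a CEL file.
intensities = [
    [100.0, 220.0, 95.0],
    [310.0, 150.0, 205.0],
    [120.0, 400.0, 180.0],
]

# Hypothetical layout, standing in for what a CDF provides:
# probe set name -> list of (row, col) cells that belong to it.
layout = {
    "probeset_A": [(0, 0), (0, 2), (2, 0)],
    "probeset_B": [(1, 0), (2, 1)],
}


def summarize(intensities, layout):
    """Collapse per-cell intensities into one value per probe set (median)."""
    return {
        name: median(intensities[row][col] for row, col in cells)
        for name, cells in layout.items()
    }


summary = summarize(intensities, layout)
print(summary)  # {'probeset_A': 100.0, 'probeset_B': 355.0}
```

Without the layout there is no way to know that cells (0, 0), (0, 2) and (2, 0) go together, which is why R asks for the CDF when it computes expression values, while simply reading the grid does not need it.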