I am working on a custom CDF: Brainarray HGU133Plus2_Hs_ENTREZG_v17.
I want to convert the probe names to Ensembl names. I have a few specific questions; if instead anyone prefers to explain the right way of doing that, I'd also be glad.
1) Just to get it straight: the ENTREZG does not mean that the CDF itself is in any way specific to ENTREZ, is that correct? I prefer working in Ensembl, so can I safely use HGU133Plus2_Hs_ENSG_v17 from Brainarray's website?
When I go to Brainarray's website and download the "CDF Seq Map Desc" file (last column), I see lines like these:
Probe Set Name      Chr  Chr Strand  Chr From  Probe X  Probe Y  Affy Probe Set Name
ENSG00000000003_at  X    -           99884769  1019     717      209109_s_at
ENSG00000000003_at  X    -           99884536  1054     679      209108_at
2) Does "Affy Probe Set Name" (the last column) stand for the probe set names of the Brainarray custom CDF?
3) What does the "Probe Set Name" (first column) mean? Are they simply Ensembl names (the ones that I need)?
I'm unsure of the specifics of the BrainArrays, but they are essentially the same as the Affy 'chips' on which they are based. So, the probe-set names are Affy probe IDs. In your case, the underlying chip was Affy U133 Plus 2. So, you can easily obtain extra annotation like this:
Kevin
But part of the idea of a custom CDF is to remap the probes to a more current genome annotation (as well as to deal with the problem of many probes that represent the same gene).
If the annotation is identical to that of the original Affymetrix probes, that seems to miss part of the reason for creating a custom CDF in the first place...
P.S. The links on the Brainarray website itself for querying probe set identities are broken.
I suppose that it depends on whether you are content with the original annotation or not, or if you are specifically choosing BrainArrays for some reason. If you want the 'new' annotation, then just use the CDF Seq Map Desc table for mapping.
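As a minimal sketch of that mapping, the snippet below groups the original Affy probe set names under each name in the first column. It assumes the CDF Seq Map Desc file is tab-delimited with exactly the header shown in the question, and that the first column is an Ensembl gene ID followed by `_at` (judging only from the two sample rows). The function names `map_probesets` and `ensembl_id` are made up for illustration, and the sample rows stand in for the real file.

```python
import csv
import io
from collections import defaultdict

# Two sample rows copied from the question; a real file would be opened from disk.
# Assumption: columns are tab-separated and the header matches the one shown above.
SAMPLE = (
    "Probe Set Name\tChr\tChr Strand\tChr From\tProbe X\tProbe Y\tAffy Probe Set Name\n"
    "ENSG00000000003_at\tX\t-\t99884769\t1019\t717\t209109_s_at\n"
    "ENSG00000000003_at\tX\t-\t99884536\t1054\t679\t209108_at\n"
)


def map_probesets(handle):
    """Return {first-column name: sorted list of original Affy probe set names}."""
    mapping = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        mapping[row["Probe Set Name"]].add(row["Affy Probe Set Name"])
    return {name: sorted(affy) for name, affy in mapping.items()}


def ensembl_id(probeset_name):
    """Strip a trailing '_at' (assumed suffix convention) to leave the gene ID."""
    if probeset_name.endswith("_at"):
        return probeset_name[:-3]
    return probeset_name


mapping = map_probesets(io.StringIO(SAMPLE))
for probeset, affy_sets in mapping.items():
    print(ensembl_id(probeset), affy_sets)
```

For the real file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")`, where `path` is wherever you saved the download.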