Question

Problems Identifying The Right Version Of Chip: Ht-Hgu133A Vs Ht-Hgu133A V2

11.9 years ago

Kenneth Daily ▴ 50

I'm trying to use some cancer cell line data from the paper "Systematic identification of genomic markers of drug sensitivity in cancer cells":

http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-783/

The title says that the platform is HT-HGU133a V2, but the array accession linked to it says it is HT-HGU133a. I thought that some probes or the layout had been changed between the U133a and U133a2 chips. I want to be able to add in our cell line data, which was run on the HGU133a V2 platform.

I then found this BioC thread that seemed to say that the annotation to use would be hgu133a2, but that a separate annotation was made for hthgu133a to 'reduce' confusion:

https://stat.ethz.ch/pipermail/bioconductor/2009-April/027250.html

But I'm still confused! When I load the raw CEL files and normalize them with the two different CDFs (hthgu133a vs. hgu133a2), I get completely different results: the expression values for a single sample normalized with one CDF have absolutely no correlation with the values from the other. That makes me think the layout is somehow different, even though the probe sets in the resulting ExpressionSets are in the same order and match exactly. Can anyone offer a suggestion, or is there a way that I can compare the layouts of the two chips?

I also tried this in Partek; it failed when I manually changed the annotation to HGU133a V2 (it said that the dimensions were incorrect).

microarray affymetrix • 2.6k views

updated 11.9 years ago by Charles Warden 8.3k • written 11.9 years ago by Kenneth Daily ▴ 50

score 1 · Answer 1 · 2013-06-07

If you are using Partek, it should automatically recognize the type of .CEL file (and, most likely, download the appropriate annotation file if it is not already among your library files). I would not recommend manually changing the annotation type if Partek is able to recognize it automatically.

Just to be clear, these are HT-HGU133A arrays (not the regular HGU133A arrays). These are often used for large genome projects (like TCGA, before they switched over to RNA-Seq). I think the reason is that you can process more arrays at the same time. I think the probes are essentially the same, but the layout may be different. I'm not sure what the differences are (if any) for the V2 HT-arrays. For example, you can't specifically order a V2 HT-array: http://www.affymetrix.com/estore/browse/products.jsp?productId=131441#1_1
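If you want to check the layout question directly, one option is to compare where each probe set's cells sit in the two chip definitions. Here is a minimal Python sketch of that idea. It assumes you have already exported each CDF's mapping from probe set to (x, y) cell coordinates by some other means; `compare_layouts` is a hypothetical helper, and the coordinates are made-up toy values, not the real layouts.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]
Layout = Dict[str, List[Coord]]  # probe set ID -> list of (x, y) cells


def compare_layouts(a: Layout, b: Layout) -> dict:
    """Summarize how two probe-set-to-coordinate mappings differ."""
    shared = sorted(set(a) & set(b))
    moved = [p for p in shared if a[p] != b[p]]
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "shared": len(shared),
        "same_position": len(shared) - len(moved),
        "moved": moved,
    }


# Toy data (assumption): same probe set names, one set placed differently.
layout_ht = {"1007_s_at": [(0, 1), (5, 7)], "1053_at": [(2, 2), (3, 9)]}
layout_v2 = {"1007_s_at": [(10, 4), (6, 1)], "1053_at": [(2, 2), (3, 9)]}

report = compare_layouts(layout_ht, layout_v2)
print(report)
```

If the probe set names match but many coordinates differ, the wrong CDF would read intensities from the wrong cells, which would be consistent with matching feature names but uncorrelated values.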

So, I would probably just use Partek (with the automatically recognized annotation file). If you are really worried, there should also be processed expression values in the ArrayExpress project (so, you don't have to worry about doing your own mapping - you just have to trust that proper normalization was performed).
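The deposited processed values can also serve as a sanity check on your own normalization: whichever CDF gives values that correlate well with them is more likely the right one. A rough Python sketch, assuming you have both sets of values for one sample as probe set ID to value mappings (the helper names and numbers below are hypothetical toy data):

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_on_shared(mine, reference):
    """Correlate two value mappings over the probe sets they share."""
    shared = sorted(set(mine) & set(reference))
    return pearson([mine[p] for p in shared], [reference[p] for p in shared])


# Toy values (assumption): one sample, own normalization vs. deposited values.
mine = {"1007_s_at": 8.0, "1053_at": 6.0, "117_at": 4.0}
reference = {"1007_s_at": 9.0, "1053_at": 7.0, "117_at": 5.0, "121_at": 3.0}

r = correlation_on_shared(mine, reference)
print(round(r, 3))
```

A different normalization method will shift the values somewhat, so expect a high but not perfect correlation with the correct CDF, and something close to zero with the wrong one.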