I am interested in a genotype dataset on GEO (GSE26105) that was obtained using the Illumina Human610-Quad v1.0 BeadChip (GPL8887).
Thanks to the answer to my previous question, I was able to understand the platform data. But now I am struggling with the genotype data, because the genotypes are encoded only as AA, AB or BB, with no other information:
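library(GEOquery)
# with GSEMatrix=TRUE, getGEO() returns a list of ExpressionSet objects
gse.geno <- getGEO('GSE26105', destdir='.', GSEMatrix=TRUE)
# extract the genotype matrix (AA/AB/BB calls) from the first ExpressionSet in the list
mat.geno <- exprs(gse.geno[[1]])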
I am aware of Illumina's technical note describing the TOP/BOT strand and the A/B alleles (and I think I understand it). But here, in the GEO dataset, there are only A's and B's, with no indication of TOP or BOT. How/where can I find this information?
Thanks for your answer. I downloaded the file ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/snpArrayIllumina1M.txt.gz. Then, with a lot of "data munging" in R, I kept only the non-ambiguous SNPs (i.e. neither A/T nor C/G) from the Illumina 610 chip for which strand information was available on the Illumina 1M chip.
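For anyone wanting to reproduce the filtering step, here is a minimal sketch of the idea, not my exact code. It uses a small toy data frame in place of the real UCSC table. The column names (name, strand, observed), the SNP ids and the helper is_unambiguous() are all assumptions made for illustration, so check them against the actual file before reusing this.

```r
# Toy stand-in for the UCSC table; the column names are assumptions.
ucsc <- data.frame(
  name     = c("rs0001", "rs0002", "rs0003", "rs0004"),
  strand   = c("+", "-", "+", "-"),
  observed = c("A/G", "C/T", "A/T", "C/G"),
  stringsAsFactors = FALSE
)

# Hypothetical SNP ids present on the 610-Quad chip (e.g. taken from the GPL8887 table).
chip610.snps <- c("rs0001", "rs0003", "rs0004", "rs0005")

# Hypothetical helper: a SNP is strand-ambiguous when its two alleles are
# complementary (A/T or C/G), since it then looks the same on both strands.
is_unambiguous <- function(observed) {
  !(observed %in% c("A/T", "T/A", "C/G", "G/C"))
}

# Keep SNPs that are on the 610 chip AND unambiguous AND have strand info in the 1M table.
keep <- ucsc[is_unambiguous(ucsc$observed) & ucsc$name %in% chip610.snps, ]
```

With the real file, the toy data frame would be replaced by the table read from snpArrayIllumina1M.txt.gz (for example via read.table() on a gzfile() connection), with column names assigned by hand since the dump has no header line.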