For the Affymetrix HG-U133A_2 array, I used biomaRt to retrieve the annotation. However, I am finding multiple entries for the same probe ID, while only one of them is represented in the official annotation file provided by the vendor. For instance:
200033_at has 4 entries, including protein_coding and miRNA biotypes. Additionally, the chromosome_name for a couple of the results is HG183_PATCH (I am assuming this is an old entry that has been patched over and replaced by another entry?).
I ended up writing a parser for the original annotation file provided by the vendor, but it seems very strange that biomaRt would retrieve different entries for the exact same probe ID. Does anyone know why this occurs, and is there any way to avoid it?
Code:
library("biomaRt")
library("Biobase")  # provides exprs()
# Connect to the GRCh37 Ensembl archive
mart <- useMart("ENSEMBL_MART_ENSEMBL", host="http://grch37.ensembl.org")
mart <- useDataset("hsapiens_gene_ensembl", mart)
x <- listAttributes(mart)  # available attributes, kept for reference
# Filter on the same array as the requested attribute (HG-U133A_2, not the Plus 2 array)
annotLookup <- getBM(mart=mart,
  attributes=c("affy_hg_u133a_2", "ensembl_gene_id", "gene_biotype", "external_gene_name", "chromosome_name", "start_position", "end_position", "strand"),
  filters="affy_hg_u133a_2", values=rownames(exprs(gset)), uniqueRows=TRUE)
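To show what I mean by multiple entries, here is a minimal base-R sketch of how I am counting rows per probe and setting aside the rows whose chromosome_name is not a primary chromosome. The data frame below is a made-up stand-in for the getBM() result (the gene IDs and the second probe are placeholders, not real output), and treating the patch rows as discardable is only my assumption.

```r
# Toy stand-in for the getBM() result; every value here is hypothetical.
annotLookup <- data.frame(
  affy_hg_u133a_2 = c("200033_at", "200033_at", "200033_at", "200033_at", "probe_B"),
  ensembl_gene_id = c("GENE_1", "GENE_2", "GENE_3", "GENE_4", "GENE_5"),
  gene_biotype    = c("protein_coding", "miRNA", "protein_coding", "miRNA", "protein_coding"),
  chromosome_name = c("17", "17", "HG183_PATCH", "HG183_PATCH", "1"),
  stringsAsFactors = FALSE
)

# Number of annotation rows returned for each probe ID
hitsPerProbe <- table(annotLookup$affy_hg_u133a_2)

# Keep only rows located on primary chromosomes (1-22, X, Y, MT);
# this drops names such as HG183_PATCH (assumption: those rows are not wanted)
primaryChr <- c(1:22, "X", "Y", "MT")
primary <- annotLookup[annotLookup$chromosome_name %in% primaryChr, ]

# Probes that still map to more than one gene after that filter
multi <- names(which(table(primary$affy_hg_u133a_2) > 1))
```

Even after dropping the patch rows in this toy example, 200033_at still has two entries (protein_coding and miRNA), which is the part I do not understand.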