It's always difficult to "guess the identifier" without additional context, but what I think you have there are Affymetrix transcript cluster IDs.
You should first download a probeset annotation file from Affymetrix (account required). In your case, I think this is the appropriate page. Scroll down to "Archived NetAffx Annotation Files".
I downloaded the zip file from the link "HuEx-1_0-st-v2 Probeset Annotations, CSV Format, Release 32 (40 MB, 6/23/11)" and unzipped it. Here's part of the grep output for one of your IDs:
grep 2315633 HuEx-1_0-st-v2.na32.hg19.probeset.csv
"2315637","chr1","+","1167620","1167657","4","2315633","297","407","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 1 // 1 // 0 /// ENST00000379198 // chr1 // 100 // 1 // 1 // 0","3","2","4","1","extended","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","main"
"2315638","chr1","+","1167689","1167804","4","2315633","297","408","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 4 // 4 // 0 /// ENST00000379198 // chr1 // 100 // 4 // 4 // 0","1","2","0","2","core","0","0","1","2","0","2","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","main"
"2315639","chr1","+","1167873","1167951","4","2315633","297","409","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 4 // 4 // 0 /// ENST00000379198 // chr1 // 100 // 4 // 4 // 0","1","2","0","2","core","0","0","1","2","1","4","0","0","1","0","1","1","0","1","0","0","0","0","0","0","0","0","main"
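Note that a plain grep matches the number anywhere on a line, so it could also return rows where 2315633 happens to be a coordinate or a probeset ID. Here is a minimal sketch of a stricter match. It assumes the transcript cluster ID is the seventh comma-separated field (as in the rows above) and that none of the first seven fields contains an embedded comma. The file sample.probeset.csv is a small stand-in built from the rows above, and its last row is made up purely for illustration.

```shell
# Stand-in for the annotation file: the first seven fields of the three rows above,
# plus one hypothetical row where 2315633 appears as a coordinate, not as a cluster ID.
cat > sample.probeset.csv <<'EOF'
"2315637","chr1","+","1167620","1167657","4","2315633"
"2315638","chr1","+","1167689","1167804","4","2315633"
"2315639","chr1","+","1167873","1167951","4","2315633"
"9999999","chr1","+","2315633","2315700","4","1111111"
EOF

# Keep rows whose 7th field is exactly the quoted cluster ID,
# then print the 1st field (the probeset ID) with the quotes stripped.
ids=$(awk -F',' -v id='"2315633"' '$7 == id { gsub(/"/, "", $1); print $1 }' sample.probeset.csv)
echo "$ids"
```

To try this on the real download, replace sample.probeset.csv with HuEx-1_0-st-v2.na32.hg19.probeset.csv.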
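Column 1 is the probeset ID and column 7 is the transcript cluster ID. The problem is that few ID conversion systems accept transcript cluster IDs, whereas many accept probeset IDs. So you can first look up the probesets that belong to your cluster and then convert those, for example with the R biomaRt package: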
library(biomaRt)
mart.hs <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
# get probeset IDs for transcript cluster 2315633
huex <- read.table("~/Downloads/HuEx-1_0-st-v2.na32.hg19.probeset.csv", sep = ",", stringsAsFactors = FALSE, header = TRUE)
probes <- subset(huex, transcript_cluster_id == "2315633")$probeset_id
# get gene symbols
genes <- getBM(attributes = c("affy_huex_1_0_st_v2", "hgnc_symbol"), filters = "affy_huex_1_0_st_v2", values = probes, mart = mart.hs)
genes
#   affy_huex_1_0_st_v2 hgnc_symbol
# 1             2315638     B3GALT6
# 2             2315642     B3GALT6
# 3             2315639     B3GALT6
# 4             2315643     B3GALT6
# 5             2315644     B3GALT6
# 6             2315640     B3GALT6
# 7             2315637     B3GALT6
# 8             2315645     B3GALT6
# 9             2315641     B3GALT6
For more information, search this site for "biomart".
Neilfws: if you have an established pipeline for analysing Exon 1.0 ST arrays (with oligo or any other package), could you please share it, or point me towards a tutorial? I tried to follow the oligo package user guide, but I found it confusing. Thanks.
Hi Neilfws, I am trying to map HuGene-2_0-st (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL16686) IDs using the script above, but I get the following error:
Error in getBM(attributes = c("HuGene-2_0-st", "hgnc_symbol"), filters = "HuGene-2_0-st", : Invalid attribute(s): HuGene-2_0-st
I also tried appending _v1 or _v2, without success. How can I locate the actual attribute name? Do you have any suggestions? Thanks.