How can I convert gene symbols to Affymetrix IDs?
9.7 years ago
Mo ▴ 920

I have a list of human gene IDs that I would like to convert to Affymetrix IDs.

Example gene IDs:

GNAI1
HLA-DQA1
WDR1
LTBP2
MXRA5
MMS19
SEC13
C6
IL16
FAM82A1
GBE1
TUBAL3
CYB5R3
ANXA1
APCS
COL14A1

Do you know how I can do this in R or with a web-based tool?

I have tried the following online ID converters, but none of them worked.

DAVID only converts to three Affymetrix ID types: 3PRIME_IVT_ID, EXON_gene_ID and SNP_ID.

I also tried Ensembl BioMart as described below, but it did not convert anything:

  1. First go to the website
  2. Select "Ensembl Genes 78" and "Homo sapiens genes (GRCh38)"
  3. Then click on "Filters" -> "Gene" and paste the example list of genes there
  4. Then click on "Attributes" -> "External" and select "Affy HG U133A probeset"
  5. Then click on "Results"; nothing showed up
microarray r affymetrix • 12k views

If nothing showed up then you did something wrong. I used your example gene symbols (not IDs) at Ensembl BioMart and the conversion works fine (click "Results" after loading this URL). Perhaps you forgot to specify "HGNC symbol(s)" as the ID type in Filters?

This can also be done using R/biomaRt. Please search this site for numerous answers to similar questions.
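A minimal biomaRt sketch of that lookup, assuming the biomaRt package is installed and the Ensembl server is reachable. The helper name symbols_to_affy is made up for illustration. hgnc_symbol and affy_hg_u133a are, as far as I know, the filter and attribute names for the human dataset; if the query complains, check listFilters(mart) and listAttributes(mart).

```r
# Hypothetical helper (not from the thread): look up HG-U133A probe sets
# for a vector of HGNC symbols through Ensembl BioMart.
symbols_to_affy <- function(symbols) {
  # Connect to the Ensembl gene mart for human (requires network access)
  mart <- biomaRt::useEnsembl(biomart = "ensembl",
                              dataset = "hsapiens_gene_ensembl")
  res <- biomaRt::getBM(
    attributes = c("hgnc_symbol", "affy_hg_u133a"),
    filters    = "hgnc_symbol",
    values     = unique(as.character(symbols)),
    mart       = mart
  )
  # Drop rows where the gene has no probe set on this array
  res[!is.na(res$affy_hg_u133a) & res$affy_hg_u133a != "", ]
}

genes <- c("GNAI1", "HLA-DQA1", "WDR1", "LTBP2")
# res <- symbols_to_affy(genes)   # uncomment to run the query
```

The result has one row per symbol/probe set pair, so a gene with several probe sets appears several times, and the rows are not guaranteed to follow the input order.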

Which organism and which array? There are annotation packages in Bioconductor for most common ones that can facilitate the conversion.

@Devon Ryan Human cell lines such as promyelocytic leukemia and breast adenocarcinoma, on the HG-U133A microarray platform.

9.7 years ago
komal.rathi ★ 4.1k

This will give you the Affy probe ID to gene symbol mappings for the HG-U133A microarray platform:

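# biocLite() has been retired; current Bioconductor releases install via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("hgu133a.db")
library(hgu133a.db)
library(annotate)
# Bimap object that maps probe IDs to gene symbols
x <- hgu133aSYMBOL
# Get the probe identifiers that are mapped to a gene symbol
mapped_probes <- mappedkeys(x)
# Convert to a data frame with columns probe_id and symbol
genesym.probeid <- as.data.frame(x[mapped_probes])
head(genesym.probeid)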
   probe_id symbol
1   1053_at   RFC2
2    117_at  HSPA6
3    121_at   PAX8
4 1255_g_at GUCA1A
5   1316_at   THRA
6   1320_at PTPN21

Then you can subset this dataframe using your gene list.
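As a hedged alternative to building the full table first, the annotation package can also be queried directly with AnnotationDbi::select(). The sketch below assumes hgu133a.db is installed. The wrapper name probes_for_symbols and its chip_db argument are made up here. Symbols without a probe set on the array should come back with NA in the PROBEID column, and symbols that are not current official symbols in the package (older names, for example) may need the ALIAS keytype instead.

```r
# Hypothetical wrapper (not from the thread) around AnnotationDbi::select().
# chip_db is the annotation object exported by the chip package,
# e.g. hgu133a.db after library(hgu133a.db).
probes_for_symbols <- function(symbols, chip_db) {
  AnnotationDbi::select(
    chip_db,
    keys    = unique(as.character(symbols)),
    keytype = "SYMBOL",
    columns = "PROBEID"
  )
}

# Usage (needs the annotation package installed):
# library(hgu133a.db)
# probes_for_symbols(c("GNAI1", "ANXA1", "C6"), hgu133a.db)
```

select() returns one row per symbol/probe pair, so a gene measured by several probe sets appears on several rows.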

@komal.rathi Thanks for your message. Let's assume your solution works. Look, the list is empty:

test <- structure(list(V1 = structure(c(8L, 9L, 16L, 11L, 13L, 12L, 14L, 
3L, 10L, 6L, 7L, 15L, 5L, 1L, 2L, 4L), .Label = c("ANXA1", "APCS", 
"C6", "COL14A1", "CYB5R3", "FAM82A1", "GBE1", "GNAI1", "HLA-DQA1", 
"IL16", "LTBP2", "MMS19", "MXRA5", "SEC13", "TUBAL3", "WDR1"), class = "factor")), .Names = "V1", class = "data.frame", row.names = c(NA, 
-16L))

x <- test
# Get the probe identifiers - gene symbol mappings
mapped_probes <- mappedkeys(x)
# Convert to a dataframe
genesym.probeid <- as.data.frame(x[mapped_probes])
head(genesym.probeid)

# data frame with 0 columns and 6 rows

hgu133aSYMBOL maps probe IDs to gene symbols. What I would do is keep everything as in @komal.rathi's answer and then do:

genes <- c("ANXA1", "APCS", "C6", "COL14A1", "CYB5R3", "FAM82A1", "GBE1", "GNAI1", "HLA-DQA1", "IL16", "LTBP2", "MMS19", "MXRA5", "SEC13", "TUBAL3", "WDR1")
results <- genesym.probeid[which(genesym.probeid$symbol %in% genes),]

That should do it.

@Mo: test, as defined by you, is a data frame, whereas x, as defined by me, is an object of class ProbeAnnDbBimap. mappedkeys() works on the Bimap object, not on a data frame, which is why the list looks empty.

@TriS: That's what the last line in my answer implies.

@komal.rathi Thanks. So in general, you first build the full mapping in genesym.probeid, and then I should extract my list from it.

@TriS Thanks.

9.7 years ago
TriS ★ 4.7k

BioMart is probably what you are looking for.

-- edit --

I saw you used it from Ensembl; I believe this is easier to use.

Using the list you provided, it worked for me for Affy hgu133 a2.

@TriS Yes, this is easier to do, but many IDs are missing. Do you know a way to do it in R? The order of the list changes, and there are many overlapping names as well.
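One possible way to keep the input order and see which genes are missing, sketched in base R. The mapping table here is a toy stand-in for a probe_id/symbol table such as genesym.probeid: the first three rows reuse pairs from the head() output shown in the thread, while the toy_dup_at row and the NOT_ON_ARRAY symbol are invented purely to show a gene with two probe sets and a gene with none.

```r
# Toy probe-to-symbol table (last row is invented for illustration)
mapping <- data.frame(
  probe_id = c("1053_at", "117_at", "121_at", "toy_dup_at"),
  symbol   = c("RFC2", "HSPA6", "PAX8", "RFC2"),
  stringsAsFactors = FALSE
)
genes <- c("PAX8", "NOT_ON_ARRAY", "RFC2")

# Collect every probe set for each symbol, walking the genes in input order
probes_by_gene <- lapply(genes, function(g) mapping$probe_id[mapping$symbol == g])

# Collapse multiple probe sets into one string; genes without a probe set get NA
collapsed <- vapply(
  probes_by_gene,
  function(p) if (length(p) > 0) paste(p, collapse = ";") else NA_character_,
  character(1)
)

# One row per input gene, in the same order as the input list
result <- data.frame(symbol = genes, probe_ids = unname(collapsed),
                     stringsAsFactors = FALSE)

# Symbols that could not be mapped
missing_genes <- genes[is.na(result$probe_ids)]
```

Collapsing to one row per gene is only one design choice; keeping one row per probe set is often more useful when the probe IDs are later matched against expression data.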

9.7 years ago

DAVID conversion tool? http://david.abcc.ncifcrf.gov/conversion.jsp

@Geek_y There are quite a few limitations with DAVID, e.g. it is limited to 1000 genes and is not up to date. You can see above that I updated my question.
