I am having trouble annotating the probeset-level data for this particular chip: HuGene-1_0-st-v1.
The AffyIDs range as follows: 7892501, 7892502, 7892503 ... 8180413, 8180415, 8180417, 8180418
Here are my unsuccessful attempts:
(1) Using biomaRt with the getBM function. With this approach, 38% of the ~33200 probesets can be annotated:
# Replace the AffyID with the gene symbol
library(biomaRt)
mart <- useMart(biomart = "ENSEMBL_MART_ENSEMBL",host = "www.ensembl.org", path = "/biomart/martservice", dataset = "hsapiens_gene_ensembl")
hgnc <- getBM(attributes = c("affy_hugene_1_0_st_v1", "hgnc_symbol","ensembl_gene_id","entrezgene","chromosome_name","start_position","end_position","band"), filters = "affy_hugene_1_0_st_v1", values=tab$ID, mart = mart)
# Now match the array data probesets with the genes data frame
m <- match(as.numeric(tab$ID), hgnc$affy_hugene_1_0_st_v1)
# And append e.g. the HGNC symbol to the array data frame
tab$hgnc <- hgnc[m, "hgnc_symbol"]
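For reference, a percentage like the one quoted for this approach could be computed roughly as follows. This is a minimal sketch on made-up toy data: the `tab` and `hgnc` objects below are hypothetical stand-ins for the real ones, and the symbols are placeholders. It assumes that a missing HGNC symbol may come back as an empty string, so empty strings are not counted as annotated.

```r
# Toy stand-ins for the real objects; the symbols are placeholders.
tab <- data.frame(ID = c("7892501", "7892502", "7896739", "7896741"),
                  stringsAsFactors = FALSE)
hgnc <- data.frame(affy_hugene_1_0_st_v1 = c(7896739, 7896741),
                   hgnc_symbol = c("GENE_A", ""),
                   stringsAsFactors = FALSE)

# Same matching step as in the post.
m <- match(as.numeric(tab$ID), hgnc$affy_hugene_1_0_st_v1)
tab$hgnc <- hgnc[m, "hgnc_symbol"]

# A probeset counts as annotated only if it matched AND the symbol is non-empty.
annotated <- !is.na(tab$hgnc) & nzchar(tab$hgnc)
coverage <- mean(annotated)
print(coverage)
```

Counting only matched rows (without the empty-string check) would give a higher figure, so it is worth stating which definition a reported percentage uses.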
(2) Using the NetAffx annotation file from the Affymetrix Support section [1]. When I compare the ProbeIDs from the first column of the file with the ~33200 ProbeIDs from the experiment, the overlap is only 13%. The AffyIDs in the file start with the values 7896739, 7896741, 7896743 ....
(3) Using getSYMBOL(head(fit$genes$ID), "hugene10sttranscriptcluster.db") with library(annotate) and library(hugene10sttranscriptcluster.db). 32% can be annotated, but this annotation does not seem to be consistent with (1).
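To make "not consistent" concrete, the two symbol vectors can be compared probeset by probeset. The sketch below uses hypothetical vectors `sym_biomart` and `sym_db` with placeholder symbols. In practice they would hold the symbols from (1) and (3) for the same probesets in the same order.

```r
# Hypothetical symbols for the same five probesets from approaches (1) and (3).
sym_biomart <- c("GENE_A", NA, "GENE_C", "GENE_D", NA)
sym_db      <- c("GENE_A", "GENE_B", "GENE_X", NA, NA)

# Probesets annotated by both approaches, and those where the symbols agree.
both  <- !is.na(sym_biomart) & !is.na(sym_db)
agree <- both & sym_biomart == sym_db

# Probesets annotated by only one of the two approaches.
only_biomart <- !is.na(sym_biomart) & is.na(sym_db)
only_db      <- is.na(sym_biomart) & !is.na(sym_db)

summary_counts <- c(both = sum(both), agree = sum(agree),
                    only_biomart = sum(only_biomart), only_db = sum(only_db))
print(summary_counts)
```

Splitting the comparison this way separates genuine disagreements (different symbols for the same probeset) from simple differences in coverage.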
(4) Same as (3), but with the library hugene10stprobeset.db instead of hugene10sttranscriptcluster.db. Only 0.4% can be annotated, because hugene10stprobeset.db is for exon-level annotation.
My question: Is there a way to annotate 100% of the AffyIDs with a gene symbol? And where can I find the annotation information for this?
Thank you in advance for your efforts!
I am not an expert, but in a situation like this the first thing to check is the packages. Make sure that Bioconductor, the annotation package, and the annotation database packages are all up to date and that their versions are compatible with each other. Once you are sure about the versions and updates, you can look into other possible causes.
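A quick way to see which versions are installed could look like the sketch below. The package list is taken from the question and is only an assumption about what is relevant, so adjust it to your own setup.

```r
# Report the R version in use.
print(R.version.string)

# Packages mentioned in the question; adjust the list as needed.
pkgs <- c("biomaRt", "annotate", "hugene10sttranscriptcluster.db")

# Installed version of each package, or NA if it is not installed.
# system.file() returns "" for a package that cannot be found.
installed <- vapply(pkgs, function(p) {
  if (nzchar(system.file(package = p))) {
    as.character(packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))
print(installed)
```

If BiocManager is installed, running BiocManager::valid() in an interactive session additionally reports packages that are out of date or too new for the Bioconductor release in use.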