I have taken .soft data from the GEO database. For my research I need the genomic location of each gene on the Affymetrix GeneChip, so I am using the biomaRt package. When I map the .soft file against the resulting annotLookup table, some probe IDs are missing from annotLookup. How can I get the information for those genes? Please help. The dataset is GDS3487.soft from GEO. For example, 1552819_at is present in the .soft file but not in annotLookup.
The following probe IDs do not map to any gene in the refGene.txt data. In fact, 13143 IDs do not map to any gene in the Ensembl and NCBI datasets. Some of the IDs are:
Can anybody help find the reason for this? The .soft file does give a name for each of these probe IDs; any idea where those names came from? Please help. According to Ensembl, only 27199 IDs map to a gene name; the remaining probe IDs have no identity.
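To pin down exactly which probe IDs drop out, one option is a plain set difference between the IDs in the .soft data table and the IDs that came back in the lookup table. The sketch below uses only the Python standard library and is not the code used in this thread. It assumes the usual GDS .soft layout, where the data table sits between `!dataset_table_begin` and `!dataset_table_end` and has `ID_REF` and `IDENTIFIER` columns. The function names, the inline sample, and the placeholder names `NAME_A` / `NAME_B` are invented for illustration; `annotated_ids` stands in for the probe ID column of annotLookup exported from R.

```python
import io


def read_gds_table(handle):
    """Return {ID_REF: IDENTIFIER} from the data table of a GDS .soft file."""
    probes = {}
    in_table = False
    header = None
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith("!dataset_table_begin"):
            in_table = True
            continue
        if line.startswith("!dataset_table_end"):
            break
        if not in_table or not line:
            continue
        fields = line.split("\t")
        if header is None:
            # First row inside the table is the column header.
            header = fields
            id_col = header.index("ID_REF")
            name_col = header.index("IDENTIFIER")
            continue
        probes[fields[id_col]] = fields[name_col]
    return probes


def missing_probes(soft_probes, annotated_ids):
    """Probe IDs present in the .soft table but absent from the lookup table."""
    annotated = set(annotated_ids)
    return sorted(p for p in soft_probes if p not in annotated)


# Tiny invented sample in the assumed layout (values are placeholders).
sample_soft = "\n".join([
    "^DATASET = GDS3487",
    "!dataset_table_begin",
    "\t".join(["ID_REF", "IDENTIFIER", "SAMPLE_1"]),
    "\t".join(["1007_s_at", "NAME_A", "1.0"]),
    "\t".join(["1552819_at", "NAME_B", "2.0"]),
    "!dataset_table_end",
]) + "\n"

soft_probes = read_gds_table(io.StringIO(sample_soft))
annotated_ids = ["1007_s_at"]  # stand-in for the probe IDs in annotLookup
missing = missing_probes(soft_probes, annotated_ids)
for probe in missing:
    # Show the name the .soft file carries for each unmapped probe.
    print(probe, soft_probes[probe])
```

For the real file, `read_gds_table(open("GDS3487.soft"))` would replace the inline sample. Keeping the `IDENTIFIER` value next to each missing ID makes it easy to see which names the .soft file supplies for probes the lookup table lacks.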
Can you show the code that you have used?
Hey, thank you very much.
I was able to annotate many of the probes that you listed using the following sequence of commands:
I checked 2 of the examples that fail to be annotated and they are both probes that target genes whose genomic regions appear to have been removed from GRCh38 (but that are present in GRCh37). If you go to the UCSC Genome Browser, you can simply search for these. I have initially searched for:
That may not be the complete story, though.
Note that you can download comprehensive annotation for this array version from the Affymetrix / Thermofisher support site: GeneChip™ Human Genome U133 Plus 2.0 Array
[the file you may want is likely the one called 'Current NetAffx Annotation Files: HG-U133_Plus_2 Annotations, CSV format, Release 36']
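As a rough illustration of working with such a CSV, the sketch below loads an annotation table keyed by probe set ID and lists the probe sets that have no gene symbol. It is written in Python with the standard library only and rests on assumptions that should be checked against the downloaded file: that leading comment lines start with `#`, that the columns are named `Probe Set ID`, `Gene Symbol` and `Chromosomal Location`, and that missing values are written as `---`. The function names and the inline sample rows are invented placeholders, not real annotation.

```python
import csv
import io


def read_netaffx(handle):
    """Return {probe set ID: row dict}, skipping leading '#' comment lines."""
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines)
    return {row["Probe Set ID"]: row for row in reader}


def unnamed_probes(annotation, symbol_col="Gene Symbol", missing="---"):
    """Probe set IDs whose gene symbol is empty or the missing-value marker."""
    return sorted(
        pid for pid, row in annotation.items()
        if row[symbol_col].strip() in ("", missing)
    )


# Invented sample in the assumed layout; IDs and values are placeholders.
sample_csv = "\n".join([
    "#% example comment line",
    '"Probe Set ID","Gene Symbol","Chromosomal Location"',
    '"PROBE_1_at","GENE_A","LOCATION_A"',
    '"PROBE_2_at","---","---"',
]) + "\n"

annotation = read_netaffx(io.StringIO(sample_csv))
unnamed = unnamed_probes(annotation)
print(len(annotation), "probe sets loaded;", len(unnamed), "without a gene symbol")
print(annotation["PROBE_1_at"]["Chromosomal Location"])
```

With the real download, the file would be opened with `open(path, newline="")` and passed to `read_netaffx` in place of the inline sample.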
Thank you, Kevin. I am planning to move forward with the dataset from the Thermofisher support site: 'Current NetAffx Annotation Files: HG-U133_Plus_2 Annotations, CSV format, Release 36'. Only 231 probe IDs are missing the external_gene_name, which is far better than the previous dataset I was using. Thank you once again.