I've reanalysed the data to get differentially expressed transcripts, and now I'm trying to test pathway enrichment for some gene sets.
The problem I'm facing is that I need to identify which probes on the chip correspond to the genes in the sets I want to test. The chip probes are annotated with old EMBL transcript IDs (most of the IDs look like AAXXXXXX, AIXXXXXX, HXXXXX, NXXXXX, RXXXXX, TXXXXX, with digits for the Xs; for example, I know that "AI375736" corresponds to the CD28 gene).
I'm not really sure how to find a correspondence between the genes I want to study and these transcript IDs.
If anyone has any advice on how to do that it would be very helpful.
The array is quite old indeed. There are mappings to what appear to be gene descriptions, here:
Check the Excel files.
The arrays are Agilent but do not appear to be supported in biomaRt. However, I note that these IDs that you list are likely GenBank accession IDs and not probe names.
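As a quick sanity check on the point about accession IDs, here is a minimal sketch that tests whether identifiers follow the usual GenBank nucleotide accession layout (one letter plus five digits, or two letters plus six digits, with newer records using two letters plus eight digits). The function names and the example probe-style name are hypothetical, and a pattern match only suggests that an ID is an accession; it does not confirm that the record exists.

```python
import re

# GenBank nucleotide accession layouts: 1 letter + 5 digits, 2 letters + 6 digits,
# or (newer records) 2 letters + 8 digits, optionally followed by ".version".
ACCESSION_RE = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})(?:\.\d+)?$")


def looks_like_genbank_accession(identifier):
    """Return True if the identifier matches a GenBank nucleotide accession layout."""
    return bool(ACCESSION_RE.match(identifier.strip().upper()))


def split_identifiers(identifiers):
    """Partition identifiers into (accession-like, other), preserving input order."""
    matched, other = [], []
    for identifier in identifiers:
        if looks_like_genbank_accession(identifier):
            matched.append(identifier)
        else:
            other.append(identifier)
    return matched, other
```

Calling `split_identifiers` on the values of the "Reporter Database Entry[embl]" column would show how many entries are accession-like and which ones need a closer look.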
Thank you for your response. I found the IDs in those files; the exact column name is "Reporter Database Entry[embl]", so it is indeed not the probe name.
You may try to map them with this code, in that case:
I tried it, but the mapping failed. Some IDs may still map, though. Otherwise, you may consider NCBI's E-utilities (eUtils) to map these to gene symbols.
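For the eUtils route, a rough sketch in Python could look like the following. It asks ELink for links from the `nuccore` database to the `gene` database for a single accession and collects the Gene UIDs from the XML reply. The helper names are hypothetical, and it is an assumption that a given accession has such a link at all: old EST records may well return nothing, in which case the list is simply empty. Turning Gene UIDs into gene symbols would need a further ESummary call, which is not shown here.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def build_elink_url(accession):
    """Build an ELink request from the nuccore database to the gene database."""
    params = {"dbfrom": "nuccore", "db": "gene", "id": accession}
    return EUTILS_ELINK + "?" + urllib.parse.urlencode(params)


def parse_gene_ids(xml_data):
    """Collect Gene UIDs found under LinkSet/LinkSetDb/Link/Id in an ELink XML reply."""
    root = ET.fromstring(xml_data)
    return [element.text for element in root.findall("./LinkSet/LinkSetDb/Link/Id")]


def fetch_gene_ids(accession):
    """Query ELink for one accession; needs network access.

    NCBI limits unauthenticated clients to a few requests per second,
    so pause between calls when looping over many accessions.
    """
    with urllib.request.urlopen(build_elink_url(accession)) as response:
        return parse_gene_ids(response.read())
```

For example, `fetch_gene_ids("AI375736")` would return a list of Gene UIDs as strings, or an empty list if no link is recorded for that accession.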
Thank you very much for trying. I will check other IDs to see if it could work.