How To Map Custom Microarray Platform To Current Ensembl Genome Build
13.7 years ago

I found a custom-designed microarray that uses non-standard probe IDs such as JGIFrogGene99917, which I cannot map by ID. I would like to map the probes to genes in a recent Ensembl version of the genome. Is there an automatic pipeline to do this?

(If there is no off-the-shelf solution: Would it be enough to align the probe sequences against the Ensembl transcripts, or should one align to the whole genome? I've seen Ensembl's instructions, but was wondering if there's an easier way that doesn't involve a custom environment.)

microarray annotation genome ensembl • 3.9k views
13.7 years ago

Hi Michael

It is possible to use the Ensembl 'Micro Array mapping Pipeline' to generate Ensembl transcript-level annotations. There is a little overhead in setting it up, but it is fairly straightforward once you have the correct configuration in place. The custom 'arrays' environment is aimed at centralising configuration and providing useful command-line functions to help run and troubleshoot the pipeline, so hopefully it should be helpful.

You would need to convert the input to a standardised FASTA format, which for this file would simply be something like:

>ArrayName:probe_id
GTACGATGCTAGCTATGCTATGATCTACGATGCT
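
As a rough illustration of producing that header format, here is a minimal Python sketch. It assumes the probe sequences are available as a two-column, tab-separated table (probe ID, sequence); that layout, the array name `MyFrogArray`, and the function name `probes_to_fasta` are assumptions made for the example and are not part of the Ensembl pipeline. The sequence paired with the probe ID is the placeholder from above, not the real probe sequence.

```python
import csv
import io


def probes_to_fasta(rows, array_name):
    """Yield FASTA records with '>ArrayName:probe_id' headers."""
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        probe_id, sequence = row[0].strip(), row[1].strip().upper()
        if not probe_id or not sequence:
            continue
        yield f">{array_name}:{probe_id}\n{sequence}\n"


# Hypothetical input: probe_id <TAB> sequence (placeholder sequence, not real data)
example = "JGIFrogGene99917\tGTACGATGCTAGCTATGCTATGATCTACGATGCT\n"
rows = csv.reader(io.StringIO(example), delimiter="\t")
fasta = "".join(probes_to_fasta(rows, "MyFrogArray"))
print(fasta, end="")
```

For a real array you would replace the `io.StringIO` example with an open file handle on the probe table and write the result to a `.fasta` file.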

Then you would need to add some config for this array to the following module:

ensembl-analysis/modules/Bio/EnsEMBL/Analysis/Config/ImportArrays.pm

The rest of the set up and config should be described in the pipeline documentation available here:

http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl-functgenomics/docs/array_mapping.txt?revision=1.18&root=ensembl&view=markup

Thanks

Nathan Johnson Ensembl Regulation

13.7 years ago

Update:

I checked: the SPOT_IDs do in fact map to the Ensembl Xenopus genome, e.g.

http://www.ensembl.org/Xenopus_tropicalis/Lucene/Results?species=Xenopus_tropicalis;idx=;q=ENSXETT00000000002

Original answer:

In this specific case I think I would try to contact the submitters. This is an Agilent custom array for frog. Agilent probably started with a list of genes for which they created the probes. So both Agilent and the customer should have that list. But I am not sure Agilent will be allowed to give it to you without consent of the customer.

Also, I am not sure whether the ENSXE.... IDs really cannot be mapped somewhere using publicly available mappings for Xenopus. But you probably already tried that.

If you want to annotate the array, I think it is best to BLAST against Ensembl genes. Not the whole genome because, while you might find an unknown gene now and then, you are more likely to find non-expressed sequence. I would only use transcripts (instead of full genes) if you think the array would be able to discriminate splice variants (which doesn't seem likely).
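
If you go the BLAST route, the hits still have to be reduced to one annotation per probe. The following is only a sketch of that step, assuming the probes were searched with BLAST+ and the results saved in the standard 12-column tabular format (`-outfmt 6`). The thresholds, the function name `best_hits`, and the example rows (including the target names `transcript_A` and `transcript_B`) are invented for illustration and are not real results for this array.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6)
COLS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
        "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, min_identity=95.0, min_length=30):
    """Return {probe_id: (target_id, bitscore)} for the top-scoring hit per probe.

    min_identity and min_length are arbitrary example cut-offs.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(COLS):
            continue  # skip malformed lines
        hit = dict(zip(COLS, row))
        if float(hit["pident"]) < min_identity or int(hit["length"]) < min_length:
            continue
        score = float(hit["bitscore"])
        probe = hit["qseqid"]
        if probe not in best or score > best[probe][1]:
            best[probe] = (hit["sseqid"], score)
    return best


# Invented rows for illustration only
example = (
    "MyFrogArray:JGIFrogGene99917\ttranscript_A\t100.00\t34\t0\t0\t1\t34\t501\t534\t1e-12\t63.9\n"
    "MyFrogArray:JGIFrogGene99917\ttranscript_B\t94.12\t34\t2\t0\t1\t34\t120\t153\t2e-07\t52.8\n"
)
hits = best_hits(io.StringIO(example))
for probe, (target, score) in hits.items():
    print(probe, target, score)
```

Probes whose best hits tie across several genes would need extra handling (for example, flagging them as ambiguous); this sketch simply keeps the first top-scoring hit it sees.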


Hi Chris, thanks for looking into this. Yes, I think I could map the probes that have ENSX... names, but I'm out of luck for the JGI... names.
