I have a set of exon coordinates in human for which I want to find the orthologs in mouse, and vice versa, using PyCogent. My reading of the docs suggests that I can find the syntenic region for each exon in the human-mouse alignment, and that this should yield the orthologous exon coordinates. I tried this strategy as follows:
from cogent.db.ensembl import Compara

compara = Compara(["human", "mouse"], Release=63, account=account)
regions = compara.getSyntenicRegions(CoordName="1", Start=4775654, End=4775821, align_method="PECAN", align_clade="vertebrate")
for my_region in regions:
    print my_region
This yields the error:
File "/usr/local/lib/python2.6/dist-packages/cogent-1.5.1-py2.6-linux-x86_64.egg/cogent/db/ensembl/compara.py", line 344, in getSyntenicRegions
ref_genome = self._genomes[_Species.getSpeciesName(Species)]
KeyError: 'None'
I am using Python 2.6 and PyCogent 1.5.1. Any ideas what might be wrong? What is the easiest way to do this using PyCogent? Thank you.
Edit: the solution is to pass Species="mouse" or Species="human" to getSyntenicRegions(). This works, but it is far too slow. Are there more efficient ways to do this?
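For reference, here is a minimal sketch of the corrected call wrapped in a loop over several exons. The traceback shows that Species is used to look up the reference genome, so leaving it unset produces the KeyError: 'None' above. The helper name syntenic_regions_for_exons and the (coord_name, start, end) tuple layout are my own assumptions for illustration, not part of PyCogent. The only PyCogent call is getSyntenicRegions(), with the same keyword arguments as in my snippet above.

```python
def syntenic_regions_for_exons(compara, species, exons,
                               align_method="PECAN",
                               align_clade="vertebrate"):
    """Yield (exon, regions) pairs for each exon.

    compara: a Compara object, as constructed in the question.
    species: the genome the exon coordinates refer to, e.g. "human".
    exons:   iterable of (coord_name, start, end) tuples (hypothetical
             layout chosen for this sketch).
    """
    if species is None:
        # Fail early with a clearer message than KeyError: 'None'.
        raise ValueError("species must be given, e.g. 'human' or 'mouse'")

    for coord_name, start, end in exons:
        regions = compara.getSyntenicRegions(
            Species=species,
            CoordName=coord_name,
            Start=start,
            End=end,
            align_method=align_method,
            align_clade=align_clade,
        )
        # Materialise the results so they can be reused by the caller.
        yield (coord_name, start, end), list(regions)
```

With the compara object from my snippet, this would be called as syntenic_regions_for_exons(compara, "human", [("1", 4775654, 4775821)]). It still issues one database query per exon, so it does not address the speed problem.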
Have you considered using EnsEMBL's Perl API?
One way to speed things up (which is what I do) is to download all the EnsEMBL data and run your own local copy of their MySQL database.