Hello there,
I need to find exons that are conserved between multiple species, such as human and mouse. I have tried liftOver, but even when I tweak the parameters to make it more sensitive, I lose most of my exons. To complement this approach, I am trying blat, and it seems to be working fine. My only problem is that this approach sounds quite old-fashioned to me. Does anyone know another way to find conserved exons across species? I am sure there are many ways to do this, and I am interested to know which would be your choice as a bioinformatician.
Cheers!
PS: At the moment I only want to map 20967 exonic sequences, and blat is fast enough when I do:
blat $Genome $Query $out -t=dna -q=dna -stepSize=5 -minScore=0 -minIdentity=0 -repMatch=1000000 -noHead
If you know of a better blat configuration for this kind of mapping, that would be very useful to start exploring.
Update: With liftOver I found 856 mouse exons that overlap with annotated exons, whereas with blat I was able to find 11215 mouse exons that are conserved in human. Blat was not a bad option after all!
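Since the command above runs with -minScore=0 and -minIdentity=0, the PSL output will probably contain many weak hits, so some post-filtering seems sensible. Below is a minimal sketch in Python, assuming the headerless 21-column tab-separated PSL that -noHead produces. The function name best_hits and the 0.9 coverage cutoff are hypothetical choices for illustration, not recommended values.

```python
from collections import defaultdict


def best_hits(psl_lines, min_cov=0.9):
    """Pick one top-scoring hit per query from headerless PSL lines.

    min_cov is a hypothetical cutoff: the fraction of the query that
    must be covered by matching bases (matches + repMatches).
    Returns {qName: (tName, tStart, tEnd, strand)}.
    """
    hits = defaultdict(list)
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 21:
            continue  # skip blank or malformed lines
        matched = int(fields[0]) + int(fields[2])  # matches + repMatches
        q_size = int(fields[10])
        if q_size == 0 or matched / q_size < min_cov:
            continue
        hits[fields[9]].append(
            (matched, fields[13], int(fields[15]), int(fields[16]), fields[8])
        )

    best = {}
    for q_name, candidates in hits.items():
        candidates.sort(reverse=True)
        # Drop queries whose two best hits tie: the placement is ambiguous.
        if len(candidates) > 1 and candidates[0][0] == candidates[1][0]:
            continue
        _, t_name, t_start, t_end, strand = candidates[0]
        best[q_name] = (t_name, t_start, t_end, strand)
    return best
```

Usage would be something like best_hits(open(out_path)), where out_path is whatever file $out points to. Ranking by matched bases alone is a simplification; it ignores gaps and mismatches.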
Between human and mouse, the published set is generally good enough. Because the third codon position in a coding exon is the least functionally constrained, liftOver can fail to capture some regions in the second species, especially between phylogenetically distant species. I've mostly been using liftOver and blat over the years as well. With liftOver, I generally follow a reciprocal best-hit strategy, i.e. conserved regions that are lifted from species 1 to species 2 must lift back similarly and uniquely in the reverse direction.
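To make the reciprocal idea concrete, here is a rough Python sketch of just the filtering step. It assumes the forward and reverse lifts have already been run and loaded into dictionaries keyed by (chrom, start, end) tuples. The function name reciprocal_lift and the data layout are hypothetical, and requiring the round trip to return exactly the original coordinates is a simplifying assumption; in practice you may prefer to allow a few bases of slack.

```python
def reciprocal_lift(forward, reverse):
    """Keep regions whose lift is unique in both directions.

    forward: {species1_region: [species2_regions]} from the 1 -> 2 lift
    reverse: {species2_region: [species1_regions]} from the 2 -> 1 lift
    Regions are (chrom, start, end) tuples (an assumed layout).
    Returns {species1_region: species2_region} for reciprocal pairs.
    """
    kept = {}
    for region, lifted in forward.items():
        if len(lifted) != 1:
            continue  # failed to lift, or lifted to several places
        target = lifted[0]
        # The reverse lift must return exactly one region: the original.
        if reverse.get(target, []) == [region]:
            kept[region] = target
    return kept


human_to_mouse = {
    ("chr1", 100, 200): [("chr4", 900, 1000)],
    ("chr2", 300, 400): [("chr5", 10, 110), ("chr7", 50, 150)],
}
mouse_to_human = {("chr4", 900, 1000): [("chr1", 100, 200)]}
conserved = reciprocal_lift(human_to_mouse, mouse_to_human)
```

In this toy example only the chr1 exon survives, because the chr2 exon lifts to two mouse locations.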
Any links to this "published set"? I have only seen public lists of gene homologies, but they are not at the exon level.
This is one (https://www.ncbi.nlm.nih.gov/pubmed/22369432) and you can download the set at http://tdl.ibms.sinica.edu.tw/OrthoExon/download.html. IIRC, the annotations were hg18 and mm9, so you might need to lift it to newer versions.
Earlier this year, I was looking for ways to identify orthologous splicing events and found this paper: https://www.biorxiv.org/content/biorxiv/early/2018/03/06/277723.full.pdf. I haven't read it yet, but thought it could be of interest to you as well.
Thanks a lot Eric! These articles look very interesting; they are exactly on point for my question. I am happy that other people are already putting effort into implementing fine-scale orthology mapping at the exon level.