I work on non-model insect genomes. To look for orthologous regions across species I have been using BLAST, but I need something more robust. The online version of mVista has been helpful, but I was recently advised to use liftOver and realised that I need chain files for it. However, when I reached out to UCSC they said they cannot make chain files for invertebrate genomes. Any help in making chain files for liftOver, or in finding orthologous non-coding regions across species, will be appreciated.
My problem more specifically: I am trying to find orthology/synteny for a 200 kb region across multiple species, for which I have ATAC peaks, in order to look for similar regulatory regions.
I have used the pipeline from UCSC (http://genomewiki.ucsc.edu/index.php/DoBlastzChainNet.pl ). It is a bit tricky to install (especially the parasol job submission system from UCSC) but it works like a charm and produces all types of chains, including liftOver chains. No root access is needed to install parasol, and I found it easiest to run the pipeline with parasol rather than to modify it to run without it.
As you pointed out, I have been having trouble installing the parasol job submission system. This is where I downloaded it from: https://genecats.gi.ucsc.edu/eng/parasol.html but it does not look like what is described here: http://genomewiki.ucsc.edu/index.php/Parasol_job_control_system . Any help in downloading parasol and getting it to work will be appreciated.
There is a script available to install it directly (http://genomewiki.ucsc.edu/images/e/ea/ParasolInstall.sh.txt ). It assumes root access but can easily be modified to install somewhere else. It largely copies binaries from UCSC.
@microfuge I need a little help with running this script. I have spent quite a lot of time getting it to run, so I don't want to stop here. I am having trouble with the doClusterRun.csh step, where it tries to run the jobs in the job list (i.e. running blastz-run-ucsc).
P.S. I am using a slurm wrapper for parasol. It successfully submits jobs to the cluster but the jobs fail, so the problem is not with parasol/slurm; it has something to do with running the job itself (which I believe is blastz-run-ucsc).
Thanks in advance, will appreciate your help!
Hi, I also had most of my issues with the `run.blastz` part, but I did not use slurm. Probably unrelated, but one major change I made was to replace the `type` command in the Perl scripts with `which`, as `type` is a bash builtin and not a binary, and Perl tries to execute it as a command. Can you let me know what the error message is?
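If the point of that call is to check that a helper program can be found on PATH (an assumption on my part, I have not traced every use in the scripts), a quick pre-flight check outside the pipeline can rule out missing binaries before digging into the job logs. This is only a sketch: the tool list below is illustrative, not the pipeline's actual requirement list, and `find_missing_tools` is a made-up helper name.

```python
import shutil
from typing import Iterable, List


def find_missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names in `tools` that are not executables on PATH.

    shutil.which does the same kind of lookup as the `which` binary,
    so it does not depend on any shell builtin being available.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


# Illustrative list only (assumption): adjust it to the programs that
# your copy of the pipeline actually calls.
REQUIRED_TOOLS = ["lastz", "axtChain", "chainNet", "netChainSubset", "liftOver"]

if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

Run it in the same environment the cluster jobs see (for example inside a submitted job), since PATH on the compute nodes may differ from the login node.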
Take a look at this: https://iamphioxus.org/tag/liftover/ . It also has links to the relevant UCSC pages at the bottom of the post.
From my understanding, this is for 2 different assemblies for the same reference genome/organism. How different will it be for 2 distant species?
No. There are chain files that go across species, e.g. http://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/
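To make it clearer why the same file format works within and across species: a chain is just a header naming a target and a query sequence, followed by ungapped aligned blocks and the gaps between them. The sketch below parses that layout and maps a single position from target to query. It is a simplified illustration, not a replacement for liftOver: it takes the first chain that covers the position rather than the best-scoring one, it assumes the target strand is `+` (as it is in UCSC chain files), and the scaffold names and coordinates in `EXAMPLE` are made up.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Chain:
    t_name: str
    t_start: int
    t_end: int
    q_name: str
    q_size: int
    q_strand: str
    # Ungapped aligned blocks as (target_start, query_start, length).
    blocks: List[Tuple[int, int, int]] = field(default_factory=list)


def parse_chains(text: str) -> Iterator[Chain]:
    """Parse UCSC chain text into Chain records (0-based, half-open)."""
    chain: Optional[Chain] = None
    t_pos = q_pos = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "chain":
            if chain is not None:
                yield chain
            # chain score tName tSize tStrand tStart tEnd
            #       qName qSize qStrand qStart qEnd id
            chain = Chain(
                t_name=fields[2], t_start=int(fields[5]), t_end=int(fields[6]),
                q_name=fields[7], q_size=int(fields[8]), q_strand=fields[9],
            )
            t_pos, q_pos = chain.t_start, int(fields[10])
        elif chain is not None:
            # Block line: "size dt dq", or just "size" for the last block.
            size = int(fields[0])
            chain.blocks.append((t_pos, q_pos, size))
            t_pos += size
            q_pos += size
            if len(fields) == 3:
                t_pos += int(fields[1])  # gap in target
                q_pos += int(fields[2])  # gap in query
    if chain is not None:
        yield chain


def lift_position(chains: List[Chain], t_name: str,
                  t_pos: int) -> Optional[Tuple[str, int]]:
    """Map a 0-based target position to (query name, forward-strand position).

    Returns None if the position falls in a gap or outside every chain.
    """
    for chain in chains:
        if chain.t_name != t_name or not chain.t_start <= t_pos < chain.t_end:
            continue
        for b_t, b_q, size in chain.blocks:
            if b_t <= t_pos < b_t + size:
                q_pos = b_q + (t_pos - b_t)
                if chain.q_strand == "-":
                    # Query coordinates count from the reverse-strand start.
                    q_pos = chain.q_size - 1 - q_pos
                return chain.q_name, q_pos
    return None


# Made-up example: one chain between two hypothetical scaffolds.
EXAMPLE = """\
chain 1000 scafA 100 + 10 60 scafB 80 - 5 50 1
20 5 0
25
"""

if __name__ == "__main__":
    chains = list(parse_chains(EXAMPLE))
    print(lift_position(chains, "scafA", 12))  # ('scafB', 72)
    print(lift_position(chains, "scafA", 32))  # None: inside a target gap
```

Nothing in the format refers to the species, so the difference between a same-species and a cross-species chain file lies in how the underlying alignments were produced, not in how liftOver reads them. For distant species you should expect more gaps and more regions that return nothing.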