Exonerate command in Ubuntu
5 months ago

I am using the exonerate command in the pseudogene pipeline https://github.com/kelkar/Discover_pseudogenes, but on Ubuntu the command ends with a "Killed" message and does not complete. What could be the reason? If disk space or RAM is the cause, what can be done about it? Is there any other way to generate this output?

exonerate --percent 60 --model protein2genome --showquerygff yes --showtargetgff yes -q QUERY_PROTEINS.fasta -t TARGET_GENOME.fasta --ryo "RYO\t%qi\t%ti\t%ql\t%tl\t%qal\t%qab\t%qae\t%tal\t%tab\t%tae\t%et\t%ei\t%es\t%em\t%pi\t%ps\t%g\nTransitionStart\n%V{%Pqs\t%Pts\t%Pqb\t%Pqe\t%Ptb\t%Pte\t%Pn\t%Pl\n}TransitionEnd\nTargetSeq\n%qs\nAligned Sequences\n>Q\n%qas\n>T\n%tas\nCoding Sequences\n>Q\n%qcs\n>T\n%tcs\n" >  EXONERATE_OUTPUT.TXT
Pseudogene-pipeline Exonerate Pseudogenes • 504 views

There are many reasons the program could be killed. First, check whether exonerate works at all with a small query and a small reference genome: use the same command line, but prepare a smaller test case.

If that works, try running a single sequence at a time (take only the first entry in QUERY_PROTEINS.fasta). If that works too, see whether you can modify the pipeline to run each alignment separately.
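One way to prepare those smaller inputs is sketched below, assuming plain multi-record FASTA files. The helper names `fasta_head` and `fasta_split` and the `query_NNNNN.fasta` naming are made up for this example; they are not part of exonerate or of the pipeline.

```shell
# fasta_head FILE [N]: print the first N records of a FASTA file (default 1).
fasta_head() {
    awk -v n="${2:-1}" '/^>/ { seen++ } seen > n { exit } { print }' "$1"
}

# fasta_split FILE DIR: write every record to its own file,
# DIR/query_00001.fasta, DIR/query_00002.fasta, ... (hypothetical naming).
fasta_split() {
    mkdir -p "$2" || return 1
    awk -v dir="$2" '
        /^>/ { if (out) close(out); out = sprintf("%s/query_%05d.fasta", dir, ++i) }
        out  { print > (out) }
    ' "$1"
}

# Example use (not run here; output names are placeholders):
#   fasta_head QUERY_PROTEINS.fasta 1 > one_protein.fasta
#   fasta_split QUERY_PROTEINS.fasta queries
```

Each file in `queries/` can then be passed to the original command with `-q`, one run per file, which also makes it easy to see which query (if any) triggers the kill.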

Are you running it on an HPC cluster? Then the command could have been killed for exceeding its time limit. How much memory does your computer have? Please provide the output of head /proc/meminfo.

If your job gets killed because of low memory, it might be best to look for a larger computer with more RAM.
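To make the /proc/meminfo figures easier to read, here is a small sketch that converts one field from kB to GiB. `meminfo_gib` is a made-up helper name, and the optional second argument (a saved copy of the meminfo output) is only there for illustration.

```shell
# meminfo_gib FIELD [FILE]: print FIELD (e.g. MemTotal) in GiB.
# /proc/meminfo reports sizes in kB; 1 GiB = 1048576 kB.
meminfo_gib() {
    awk -v key="$1:" '$1 == key { printf "%.1f\n", $2 / 1048576 }' "${2:-/proc/meminfo}"
}

# Example use (not run here):
#   meminfo_gib MemTotal
#   meminfo_gib MemAvailable
#   ulimit -v    # per-process virtual memory limit of the current shell, if any
```

MemAvailable is usually the more relevant number, since it estimates how much a new process can actually claim without swapping.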

head /proc/meminfo
MemTotal:       65783196 kB
MemFree:        59901468 kB
MemAvailable:   61456904 kB
Buffers:           70404 kB
Cached:          1990788 kB
SwapCached:        28644 kB
Active:          1356896 kB
Inactive:        2930204 kB
Active(anon):     773808 kB
Inactive(anon):  1508172 kB

Here is the system memory.


That's about 62 GB. I think that should be OK for genomes up to human size. But depending on how much... Now, please try to run a single sequence from the input against a single chromosome from the reference (ideally the one that contains it). Use BLAST to quickly find the candidate source sequence.
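A rough sketch of that single-chromosome test, assuming the chromosome is identified by the first word of its FASTA header. `fasta_extract`, `chr1` and the file names are placeholders; the BLAST+ commands in the comments are one possible way to find the right chromosome, not something the pipeline prescribes.

```shell
# fasta_extract FILE ID: print the record whose header starts with ">ID".
fasta_extract() {
    awk -v id="$2" '
        /^>/ { keep = (substr($1, 2) == id) }
        keep { print }
    ' "$1"
}

# Example use (not run here; chr1 and the file names are placeholders):
#   makeblastdb -in TARGET_GENOME.fasta -dbtype nucl
#   tblastn -query one_protein.fasta -db TARGET_GENOME.fasta -outfmt 6 | head
#   # column 2 of the tabular output names the matching target sequence
#   fasta_extract TARGET_GENOME.fasta chr1 > chr1.fasta
```

The extracted file can then be given to exonerate with `-t` in place of the whole genome.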


In general, "Killed" appears when the program tries to use more resources than are available locally. Explanations are available here: https://stackoverflow.com/questions/726690/what-killed-my-process-and-why
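On Linux, one quick check is whether the kernel's out-of-memory (OOM) killer was involved. The sketch below filters kernel log text for the usual OOM messages. `oom_events` and `was_sigkill` are made-up helper names, the search patterns are an assumption about typical kernel wording, and reading the kernel log may need root on some systems.

```shell
# oom_events: keep only OOM-killer related lines from kernel log text on stdin.
oom_events() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# was_sigkill STATUS: succeed if an exit status means "terminated by SIGKILL".
# Shells report 128 + signal number, and SIGKILL is signal 9, hence 137.
was_sigkill() {
    [ "$1" -eq 137 ]
}

# Example use (not run here):
#   sudo dmesg -T | oom_events
#   journalctl -k | oom_events
#   exonerate ... ; was_sigkill $? && echo "terminated by SIGKILL"
```

If the log shows the exonerate process being killed, memory is the cause; if nothing shows up, a scheduler or a resource limit is the more likely suspect.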

5 months ago
  1. htop or bottom (a Rust tool) are good ways to watch memory usage over time while a tool is running. If the tool's memory use keeps climbing (30 -> 40 -> 50 -> 60 GB) and never levels off, it is probably being killed for running out of RAM.
  2. A more modern alternative to exonerate for protein mapping is miniprot.
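If neither monitoring tool is installed, a rough equivalent can be scripted. The sketch below logs the resident memory (VmRSS from /proc/PID/status, Linux only) of a running process at a fixed interval; `watch_rss`, its arguments and the output file name are invented for this example.

```shell
# watch_rss PID [INTERVAL_SECONDS] [MAX_SAMPLES]
# Print "time<TAB>resident memory in kB" until the process exits
# (or until MAX_SAMPLES lines have been printed, if given).
watch_rss() {
    pid=$1
    interval=${2:-5}
    max=${3:-0}
    n=0
    while rss_kb=$(awk '/^VmRSS:/ { print $2 }' "/proc/$pid/status" 2>/dev/null) \
          && [ -n "$rss_kb" ]; do
        printf '%s\t%s\n' "$(date +%T)" "$rss_kb"
        n=$((n + 1))
        if [ "$max" -gt 0 ] && [ "$n" -ge "$max" ]; then
            break
        fi
        sleep "$interval"
    done
}

# Example use (not run here):
#   exonerate ... > EXONERATE_OUTPUT.TXT &
#   watch_rss $! 10 | tee exonerate_rss.tsv
```

A second column that rises steadily right up to the kill points at memory. GNU time gives the peak directly: `/usr/bin/time -v` followed by the command reports "Maximum resident set size" when the command ends.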