Question

Exonerate command in Ubuntu

0


5 months ago

subashini.fbtpb106 ▴ 20

I am using the exonerate command in the pseudogene pipeline https://github.com/kelkar/Discover_pseudogenes. On Ubuntu, the command ends with a KILLED message and does not complete. What could be the reason? If storage or RAM is the cause, what is the solution? Is there any other way to generate this output?

exonerate --percent 60 --model protein2genome --showquerygff yes --showtargetgff yes -q QUERY_PROTEINS.fasta -t TARGET_GENOME.fasta --ryo "RYO\t%qi\t%ti\t%ql\t%tl\t%qal\t%qab\t%qae\t%tal\t%tab\t%tae\t%et\t%ei\t%es\t%em\t%pi\t%ps\t%g\nTransitionStart\n%V{%Pqs\t%Pts\t%Pqb\t%Pqe\t%Ptb\t%Pte\t%Pn\t%Pl\n}TransitionEnd\nTargetSeq\n%qs\nAligned Sequences\n>Q\n%qas\n>T\n%tas\nCoding Sequences\n>Q\n%qcs\n>T\n%tcs\n" >  EXONERATE_OUTPUT.TXT

Pseudogene-pipeline Exonerate Pseudogenes • 501 views

updated 5 months ago by Ram 44k • written 5 months ago by subashini.fbtpb106 ▴ 20

0


There are many reasons the program could be killed. First, check whether exonerate works with a small query and a small reference genome. Use the same command line but prepare a smaller test case.

If that works, try running only a single sequence at a time (take only the first entry in QUERY_PROTEINS.fasta). If that also works, see whether you can modify the pipeline to run each alignment separately.
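One way to take only the first entry is sketched below. It is a minimal sketch, not part of the pipeline: the helper name first_fasta_record and the output file first_query.fasta are placeholders made up for illustration, and only QUERY_PROTEINS.fasta comes from the original command.

```shell
# Print only the first record (header plus its sequence lines) of a FASTA file.
# first_fasta_record is a hypothetical helper name, not an exonerate feature.
first_fasta_record() {
    awk '/^>/ { n++ } n > 1 { exit } n == 1 { print }' "$1"
}

# QUERY_PROTEINS.fasta is the file from the original command;
# first_query.fasta is an assumed name for the one-sequence test input.
if [ -f QUERY_PROTEINS.fasta ]; then
    first_fasta_record QUERY_PROTEINS.fasta > first_query.fasta
fi
```

The original exonerate command can then be rerun unchanged except for -q first_query.fasta.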

Are you running it on an HPC cluster? If so, the command could be killed for exceeding the job's time limit. How much memory does your computer have? Please provide the output of head /proc/meminfo

If your job gets killed because of low memory, it might be best to look for a larger computer with more RAM.

5 months ago by Michael 55k

0


head /proc/meminfo
MemTotal:       65783196 kB
MemFree:        59901468 kB
MemAvailable:   61456904 kB
Buffers:           70404 kB
Cached:          1990788 kB
SwapCached:        28644 kB
Active:          1356896 kB
Inactive:        2930204 kB
Active(anon):     773808 kB
Inactive(anon):  1508172 kB

Here is the system memory.

updated 5 months ago by GenoMax 147k • written 5 months ago by subashini.fbtpb106 ▴ 20

0


That's about 62 GB. I think it should be OK for genomes up to human size. But depending on how much ... Now, please try to run a single sequence from the input against a single chromosome from the reference (ideally the one that contains it). Use BLAST to quickly find the candidate source sequence.
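A rough sketch of that test follows. The helper name fasta_record_by_id and the files first_query.fasta and single_chrom.fasta are placeholders invented here. Using tblastn (protein query against a nucleotide subject) with -subject on the whole genome is an assumption about how the BLAST step might be done; for a large genome, building a database with makeblastdb first is probably faster.

```shell
# Print the FASTA record whose ID (first word after ">") equals the given name.
# fasta_record_by_id is a hypothetical helper name.
fasta_record_by_id() {
    awk -v id="$1" '
        /^>/ { keep = (substr($1, 2) == id) }
        keep { print }
    ' "$2"
}

# Assumed file names: first_query.fasta holds one protein, single_chrom.fasta
# receives the best-matching target sequence. Skipped if tblastn is missing.
if command -v tblastn >/dev/null 2>&1 && [ -f first_query.fasta ] && [ -f TARGET_GENOME.fasta ]; then
    # Tabular output (-outfmt 6): column 2 is the subject ID, column 12 the bit score.
    best=$(tblastn -query first_query.fasta -subject TARGET_GENOME.fasta -outfmt 6 \
        | sort -k12,12gr | head -n 1 | cut -f2)
    if [ -n "$best" ]; then
        fasta_record_by_id "$best" TARGET_GENOME.fasta > single_chrom.fasta
    fi
fi
```

Exonerate can then be run with -q first_query.fasta -t single_chrom.fasta to see whether one alignment completes within the available memory.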

5 months ago by Michael 55k

0


In general, "Killed" is reported when a program tries to use more resources than are locally available. Explanations are available here: https://stackoverflow.com/questions/726690/what-killed-my-process-and-why
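To check whether the kernel's out-of-memory killer was responsible, the kernel log can be searched right after the failed run. The sketch below assumes a Linux system with dmesg available. The helper name find_oom_kills is made up for illustration, and the search patterns are a guess at typical kernel wording rather than an exhaustive list.

```shell
# Keep only kernel log lines that look like an out-of-memory kill.
# find_oom_kills is a hypothetical helper name; it reads log text on stdin.
find_oom_kills() {
    grep -Ei 'out of memory|oom-kill|killed process' || true
}

# dmesg may require root on some systems; failures are ignored here.
dmesg 2>/dev/null | find_oom_kills || true
```

If a matching line names exonerate, memory is the likely cause. If nothing matches, other limits such as a cluster scheduler's time or memory quota are worth checking.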

5 months ago by GenoMax 147k

score 0 · Answer 1 · 2024-06-18

0


5 months ago

colindaven 7.0k

htop or bottom (a Rust tool) is a good way to check memory usage over time while a tool is running. If the tool's memory use keeps climbing (30 -> 40 -> 50 -> 60 GB), it may be getting killed for running out of RAM.
A more modern alternative to exonerate for protein mapping is miniprot.
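For a non-interactive record of the same information, memory use can also be logged from the shell. This is a minimal, Linux-only sketch: rss_kb is a hypothetical helper name, and the commented loop assumes the job was started in the background from the same shell.

```shell
# Print the current resident memory (VmRSS, in kB) of a process, read from /proc.
# rss_kb is a hypothetical helper name; it relies on the Linux /proc filesystem.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Example loop (commented out): start the job in the background with the same
# arguments as the original command, then log its memory every 10 seconds.
# exonerate ... > EXONERATE_OUTPUT.TXT &
# pid=$!
# while kill -0 "$pid" 2>/dev/null; do
#     echo "$(date +%T) $(rss_kb "$pid") kB"
#     sleep 10
# done
```

A log that rises steadily toward the machine's total RAM just before the job dies points to memory exhaustion.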

5 months ago by colindaven 7.0k