5 months ago
subashini.fbtpb106
I am using the exonerate command from the pseudogene pipeline https://github.com/kelkar/Discover_pseudogenes. On Ubuntu the command does not complete and only prints "Killed". What could be the reason? If storage or RAM is the cause, what can be done about it? Is there any other way to generate this output?
exonerate --percent 60 --model protein2genome --showquerygff yes --showtargetgff yes -q QUERY_PROTEINS.fasta -t TARGET_GENOME.fasta --ryo "RYO\t%qi\t%ti\t%ql\t%tl\t%qal\t%qab\t%qae\t%tal\t%tab\t%tae\t%et\t%ei\t%es\t%em\t%pi\t%ps\t%g\nTransitionStart\n%V{%Pqs\t%Pts\t%Pqb\t%Pqe\t%Ptb\t%Pte\t%Pn\t%Pl\n}TransitionEnd\nTargetSeq\n%qs\nAligned Sequences\n>Q\n%qas\n>T\n%tas\nCoding Sequences\n>Q\n%qcs\n>T\n%tcs\n" > EXONERATE_OUTPUT.TXT
There are many reasons the program could be killed. First, check whether exonerate works with a small query and a small reference genome: use the same command line, but prepare a smaller test case.
If that works, try running a single sequence at a time (take only the first entry in QUERY_PROTEINS.fasta). If that also works, see whether you can modify the pipeline to run each alignment separately.
Are you running it on an HPC cluster? If so, the command could have been killed for exceeding its time limit. How much memory does your computer have? Please provide the output of
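The "one sequence at a time" test can be sketched with two small shell helpers. This is only an illustration: the function names (first_fasta_record, split_fasta) and the output file names are made up for this example and are not part of exonerate or the pipeline.

```shell
# Print only the first record of a FASTA file
# (the first header line plus its sequence lines).
first_fasta_record() {
    awk '/^>/ { if (++n > 1) exit } n == 1' "$1"
}

# Split a FASTA file into one file per record:
# PREFIX_1.fasta, PREFIX_2.fasta, ...
split_fasta() {
    awk -v prefix="$2" '
        /^>/ { if (out) close(out); out = sprintf("%s_%d.fasta", prefix, ++n) }
        out  { print > out }
    ' "$1"
}

# Hypothetical usage (file names are placeholders):
#   first_fasta_record QUERY_PROTEINS.fasta > one_query.fasta
#   split_fasta QUERY_PROTEINS.fasta query
```

Running the original exonerate command with -q one_query.fasta instead of -q QUERY_PROTEINS.fasta then shows whether a single alignment already exhausts the machine, or whether the problem only appears with the full query set.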
head /proc/meminfo
If your job gets killed because of low memory, it might be best to look for a larger computer with more RAM.
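To read the relevant numbers out of /proc/meminfo without converting kB by hand, something like the following could be used. The helper name mem_gib is invented for this sketch; the field names MemTotal and MemAvailable are the ones the Linux kernel reports.

```shell
# Print a /proc/meminfo field (reported in kB) converted to GiB.
# Usage: mem_gib FIELD [FILE]   (FILE defaults to /proc/meminfo)
mem_gib() {
    awk -v field="$1:" '$1 == field { printf "%.1f\n", $2 / 1024 / 1024 }' \
        "${2:-/proc/meminfo}"
}

# Hypothetical usage:
#   mem_gib MemTotal       # installed RAM
#   mem_gib MemAvailable   # RAM currently available to new processes
```

If GNU time is installed, prefixing the small test run with /usr/bin/time -v reports a "Maximum resident set size" line, which gives a rough idea of how much memory one alignment needs before scaling up.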
Here is the system memory.
That's about 62 GB. I think that should be OK for genomes up to human size. But depending on how much... Now, please try to run a single sequence from the input against a single chromosome from the reference (ideally the one that contains it). Use BLAST to quickly find the candidate source sequence.
In general, "Killed" appears when the program tries to use more resources than are available locally. Explanations are available here: https://stackoverflow.com/questions/726690/what-killed-my-process-and-why
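A rough sketch of that workflow, assuming NCBI BLAST+ is installed and the query is a protein (hence tblastn against a nucleotide database). The helper names best_subject and fasta_record, the database name genome_db, and all file names are placeholders for this example.

```shell
# Step 1 (not run here, requires BLAST+): search the protein against the genome
# and write standard 12-column tabular output.
#   makeblastdb -in TARGET_GENOME.fasta -dbtype nucl -out genome_db
#   tblastn -query one_query.fasta -db genome_db -outfmt 6 -evalue 1e-5 > hits.tsv

# Step 2: print the subject ID (column 2) of the hit with the highest
# bit score (column 12) in a BLAST tabular file.
best_subject() {
    awk -F '\t' '
        NF >= 12 && (!seen++ || $12 + 0 > best) { best = $12 + 0; id = $2 }
        END { if (id != "") print id }
    ' "$1"
}

# Step 3: print the FASTA record whose ID (first word of the header) matches.
# Usage: fasta_record FILE ID
fasta_record() {
    awk -v id="$2" '
        /^>/ { split(substr($0, 2), h, /[ \t]/); keep = (h[1] == id) }
        keep
    ' "$1"
}

# Hypothetical usage:
#   fasta_record TARGET_GENOME.fasta "$(best_subject hits.tsv)" > one_target.fasta
```

The original exonerate command can then be run with -q one_query.fasta -t one_target.fasta to see whether a single protein against a single chromosome completes.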
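One way to confirm what happened is to look at the exit status right after the failed run. By shell convention, a status above 128 means the process was ended by signal number (status - 128), so 137 corresponds to signal 9 (SIGKILL), which is what the kernel's out-of-memory killer sends. The helper name describe_exit is made up for this sketch.

```shell
# Translate a shell exit status into a short description.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "terminated by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Hypothetical usage, immediately after the exonerate command returns:
#   describe_exit $?
#
# The kernel log usually records out-of-memory kills (may require sudo):
#   dmesg | grep -i -E 'out of memory|killed process'
```

If the kernel log shows exonerate being killed for memory, the options discussed above apply: smaller inputs per run, or a machine with more RAM.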