Cmscan multithreading issue

7.0 years ago

Andrés Ribone ▴ 60

Hi, I'm working on de novo transcriptome assembly from RNA-seq data. Right now I'm using cmscan with Rfam to annotate the 58,808 contigs I obtained (N50 = 800). I'm working on a Debian virtual machine in a cluster, with 16 cores and 32 GB of RAM.

When I set cmscan to use all the available cores to annotate the contigs, the usage of each CPU is rather low, about 32% each and never 100%. But when I tried splitting the big FASTA file into 16 subfiles and running cmscan in parallel, one subprocess per subfile, every core was used at 100% and I got the results faster. (The -Z parameter in each run was set to the size of the whole file in Mb × 2.)
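A minimal sketch of this split-and-run approach, written in Python for illustration only. The helper names (`parse_fasta`, `z_megabases`, `split_balanced`, `cmscan_command`) and the file names are hypothetical; the only cmscan options used are `--cpu`, `-Z` and `--tblout`. It assumes that -Z should be the total residue count of the whole, unsplit file times 2 (both strands), expressed in Mb, so every chunk uses the same search-space size.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) records."""
    records: List[Record] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def z_megabases(records: List[Record]) -> float:
    """Search-space size for -Z: total residues x 2 strands, in Mb.

    Computed on the WHOLE file so every chunk gets the same value.
    """
    total = sum(len(seq) for _, seq in records)
    return total * 2 / 1_000_000


def split_balanced(records: List[Record], n_parts: int) -> List[List[Record]]:
    """Greedy split: longest contigs first, each into the lightest bin."""
    bins: List[List[Record]] = [[] for _ in range(n_parts)]
    loads = [0] * n_parts
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        i = loads.index(min(loads))
        bins[i].append(rec)
        loads[i] += len(rec[1])
    return bins


def cmscan_command(cm_db: str, fasta_path: str, tblout_path: str,
                   z_mb: float) -> List[str]:
    """Build one single-threaded cmscan call for one chunk.

    cm_db, fasta_path and tblout_path are hypothetical paths.
    """
    return ["cmscan", "--cpu", "1", "-Z", f"{z_mb:.6f}",
            "--tblout", tblout_path, cm_db, fasta_path]
```

Each chunk would be written to its own FASTA file, and each command list could then be launched as a separate process (for example with `subprocess.run`), keeping one `--tblout` file per chunk to concatenate afterwards. Balancing chunks by total length rather than by number of contigs is a design choice here, so that no single process is left with most of the long sequences.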

However, I am not sure whether this alternative is correct (in the end I used the results from the run on the unsplit FASTA file). What should I do next time with a similar task?

Thanks in advance for reading!

infernal rfam cmscan transcriptome assembly
