Question

Cuffdiff Is Taking Forever. Can I Parallelize It?

1

13.4 years ago

Ryan Thompson ★ 3.6k

I am running cuffdiff on about 20 samples, and I simply started a single job for the whole genome. It is still running (and still producing output, so it isn't "stuck" or anything) 5 days later. I think I could parallelize it by splitting my input GTF file into many small sets of genes, but I am worried that it would produce incorrect FDR calculations and other "aggregate" statistics that I would not easily be able to correct when I merge the output.

Is there an easy way to parallelize cuffdiff, or must I wait for my single job to finish?

cufflinks cuffdiff parallel • 7.4k views

ADD COMMENT • link updated 9.4 years ago by Biostar 20 • written 13.4 years ago by Ryan Thompson ★ 3.6k

1

Are you using the "-p" option to use multiple threads?

ADD REPLY • link 13.4 years ago by Aaron Statham ★ 1.1k

0

Hmm. I am running it on a cluster, and I have not experimented with multi-threading on the cluster yet (instead of one job with N threads, I run N jobs with one thread each). I know my cluster allows multi-threaded jobs, but only up to the number of CPU cores available on a single node. Still, it's better than nothing.

ADD REPLY • link 13.4 years ago by Ryan Thompson ★ 3.6k
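
For a single multi-core node, one way to wire this up is to read the core count the scheduler granted and pass it to cuffdiff's `-p` option. The following is a minimal Python sketch, not a tested pipeline: the helper names `pick_thread_count` and `build_cuffdiff_command`, the file names, and the list of environment variables are all illustrative assumptions, so check which variable your own cluster sets.

```python
import os
import shlex


def pick_thread_count(env=None, default=1):
    """Return the core count granted by the scheduler, or a fallback."""
    env = os.environ if env is None else env
    # Assumed variable names for Slurm, Grid Engine, and PBS/Torque;
    # your site may use something else.
    for name in ("SLURM_CPUS_PER_TASK", "NSLOTS", "PBS_NP"):
        value = env.get(name, "")
        if value.isdigit() and int(value) > 0:
            return int(value)
    return default


def build_cuffdiff_command(gtf, conditions, threads, output_dir="cuffdiff_out"):
    """Build a cuffdiff argument list.

    `conditions` maps a condition label to its list of replicate
    alignment files (hypothetical names in the example below).
    """
    cmd = ["cuffdiff", "-p", str(threads), "-o", output_dir,
           "-L", ",".join(conditions)]
    cmd.append(gtf)
    # cuffdiff takes one comma-separated replicate list per condition.
    for replicates in conditions.values():
        cmd.append(",".join(replicates))
    return cmd


if __name__ == "__main__":
    conditions = {
        "control": ["c1.bam", "c2.bam"],   # hypothetical file names
        "treated": ["t1.bam", "t2.bam"],
    }
    cmd = build_cuffdiff_command("genes.gtf", conditions, pick_thread_count())
    # Print the command for inspection; run it with subprocess.run(cmd) if it looks right.
    print(" ".join(shlex.quote(part) for part in cmd))
```

This keeps the whole analysis in one job, so the genome-wide statistics are computed once, and only the thread count changes between machines.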

0

In the end, were you able to run it successfully? If so, how long did it take (and how much memory and how many processors)? I'm currently running 45 individuals on 24 cores, which hasn't been that useful because after the mapping stage cuffdiff seems to revert back to 1 core. Any thoughts appreciated!

ADD REPLY • link 12.8 years ago by Hc • 0

0

No, my internship ended. I'm now using Cufflinks again on a completely unrelated project, this time on a single 8-core workstation. We'll see how things go this time.

ADD REPLY • link 12.8 years ago by Ryan Thompson ★ 3.6k

score 2 · Answer 1 · 2011-08-09

2

13.4 years ago

Sean Davis 27k

Be sure to use the "-p" option, as Aaron suggests in his comment. And, yes, splitting into small sets of genes is not the way to go for the reasons that you suggest.

ADD COMMENT • link 13.4 years ago by Sean Davis 27k
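
To illustrate the FDR concern raised in the question, here is a small Python sketch of Benjamini-Hochberg adjustment applied once to a full set of p-values and then separately to two halves of the same set. The p-values are made up, and the sketch assumes a plain Benjamini-Hochberg correction; it does not reproduce cuffdiff's internals.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the p-value ranking.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values standing in for per-gene test results.
pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]

# One correction over every test, as a single whole-genome job would do.
whole = benjamini_hochberg(pvalues)

# Separate corrections per chunk, as naive GTF splitting would do.
chunked = benjamini_hochberg(pvalues[:4]) + benjamini_hochberg(pvalues[4:])

significant_whole = sum(q < 0.05 for q in whole)
significant_chunked = sum(q < 0.05 for q in chunked)
print(significant_whole, significant_chunked)
```

With these numbers, two tests pass a 0.05 threshold when corrected together and four pass when corrected per chunk, because each chunk's correction only accounts for its own four tests. Re-running the correction over the concatenated raw p-values would repair this particular step, but any other quantity estimated from all genes at once would still differ between a split run and a whole-genome run.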