Cufflinks: Long time for assembly-"Waiting for 1 thread to complete"
9.9 years ago
mjoyraj ▴ 80

I have been trying to assemble RNA-seq reads with Cufflinks using the following options:

cufflinks -p 8 -o output -g gtf.file --max-bundle-frags 1000000000000 --multi-read-correct acctpted.bam

It has been running for the past 7 days and keeps showing "Waiting for 1 thread to complete".

However, if I remove the -g option, it completes in 36 hours. Has anybody faced a similar situation?

What could be the possible solution?

Assembly RNA-Seq • 4.3k views

Which organism are you assembling? Is your GTF file correct? You want to keep the GTF file (option -g), because that's the reference you are using.

I'm not sure why you are using that --max-bundle-frags value. Do you really need it that high? I'd try leaving it at the default setting.

Also, I haven't used it in a while, but -p sets the number of threads. Are you using them correctly? If you are on a server, for example, you might have to talk to your IT people to make sure the settings you pass to Cufflinks don't conflict with what actually happens once the job is launched on that server.

7 days is definitely too long. I'm not sure which computer or server you are using for the assembly, but 7 days is a lot by today's standards!
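To make the --max-bundle-frags suggestion concrete, here is a minimal Python sketch that builds the command line from the question and only adds --max-bundle-frags when a value is passed in, so leaving it out falls back to whatever default Cufflinks ships with. The helper name `build_cufflinks_cmd` and the file names `accepted_hits.bam` and `genes.gtf` are placeholders assumed for illustration; the flags themselves are the ones from the original command.

```python
import shlex


def build_cufflinks_cmd(bam, gtf, out_dir, threads=8, max_bundle_frags=None):
    """Return a cufflinks argument list (hypothetical helper, not part of Cufflinks)."""
    cmd = ["cufflinks", "-p", str(threads), "-o", out_dir, "-g", gtf]
    if max_bundle_frags is not None:
        # Only override the bundle limit when a value is explicitly requested.
        cmd += ["--max-bundle-frags", str(max_bundle_frags)]
    cmd += ["--multi-read-correct", bam]
    return cmd


# Same kind of run as in the question, but without overriding the bundle limit.
# File names are placeholders, not the poster's actual paths.
default_cmd = build_cufflinks_cmd("accepted_hits.bam", "genes.gtf", "output")
print(" ".join(shlex.quote(arg) for arg in default_cmd))
```

The resulting list can be handed to `subprocess.run(default_cmd, check=True)`. Running once with the default and once with the large value would show whether the override is what makes the -g run so slow.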


Dear Tris, the organism is chicken. The GTF file is okay, because I used the same GTF file for mapping with TopHat. I also assembled with the same GTF file using Cufflinks in Galaxy. Does increasing --max-bundle-frags increase the run time?

I am using a server that has four compute nodes: two (Za1, Za2) for assembly (512 GB RAM, 64 threads) and another two (Zm1, Zm2) for mapping (128 GB RAM, 32 threads). I am using node Zm1 because Za1 is having some problems. On Zm1 I am using 8 threads, as set by the -p option.


How many threads did you reserve to run the analysis? With -p 8 and 32 threads you can run 4 analyses in parallel. If you reserved more than 4, the analysis could end up being repeated by the other threads you reserved. If that happens, memory usage increases, and this might either cause the job to stop or keep it running until memory is freed by other threads that are no longer using it. Do you have a log file from the server? I'm not sure this is the case, but the log file could help.
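The arithmetic behind "32 threads with -p 8 gives 4 analyses" can be written as a short Python sketch. The function name `max_parallel_jobs` is made up for illustration, and the numbers are the ones quoted in this thread.

```python
import os


def max_parallel_jobs(total_threads, threads_per_job):
    """How many jobs fit on a node if each job uses threads_per_job threads."""
    if threads_per_job <= 0:
        raise ValueError("threads_per_job must be positive")
    return total_threads // threads_per_job


# Numbers from this thread: a 32-thread mapping node and cufflinks -p 8.
print(max_parallel_jobs(32, 8))

# On the node itself, os.cpu_count() reports the logical CPUs Python can see.
# It may return None, and it knows nothing about scheduler reservations.
print(os.cpu_count())
```

This only counts threads. It says nothing about memory, which is shared by all jobs on the node.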


I am running only one job, with -p 8. Sorry, I do not have a log file.

8.3 years ago

This is an old question, but I'm answering in case people are still struggling with this behavior. I found that using iGenomes for the reference genome and gene annotations eliminated the problem. Look here: http://support.illumina.com/sequencing/sequencing_software/igenome.html
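The thread does not say why switching to iGenomes helped. One thing a matched genome-plus-annotation bundle does guarantee is that the sequence names in the GTF agree with those in the alignments, and that is cheap to check. Below is a minimal Python sketch of such a check. It is an assumption-laden sanity test rather than a diagnosis of the original problem: the function names and toy inputs are made up, and the header lines are assumed to come from something like `samtools view -H` on the BAM file.

```python
def gtf_seqnames(gtf_lines):
    """Collect sequence names (column 1) from GTF lines, skipping comments."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqnames(header_lines):
    """Collect SN: values from the @SQ lines of a SAM header."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


# Toy input, invented for illustration: the GTF says "1", the BAM says "chr1".
gtf = ['1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n']
header = ["@HD\tVN:1.0\tSO:coordinate\n", "@SQ\tSN:chr1\tLN:1000\n"]

# Sequence names used by the annotation but absent from the alignments.
missing = gtf_seqnames(gtf) - sam_header_seqnames(header)
print(sorted(missing))
```

An empty result means every sequence named in the annotation also exists in the alignment header. It does not rule out other differences between annotation sources.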
