Hey y'all,
I'm currently trying to make sense of how long Trinity takes to assemble a transcriptome. I know the general rule of thumb is that assembly takes 0.5 to 1 hour per million reads.
I understand that if I had 40 million reads, it should theoretically take 20-40 hours for Trinity to assemble the transcriptome. Does this mean that if Trinity were run on 10 CPU cores, it should theoretically take 2-4 hours (barring any technically limiting steps), since the processing would be spread across the 10 cores?
Thank you all in advance.
Do you have access to a cluster or a machine with more compute threads? Test it and report back.
I am not sure one can estimate the run time for a process like Trinity without actually running a real test, since much of it depends on the hardware you have access to and the complexity of your dataset. You also need about 1 GB of RAM per million paired-end (PE) reads, so memory may become a limiting factor as well.
Practically, start with as many resources as you can use instead of trying to estimate run times.
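For what it's worth, the arithmetic behind the numbers in this thread can be written out as a small Python sketch. The function name and its parameters are made up for illustration and are not part of Trinity. It assumes the 0.5 to 1 hour per million reads figure describes single-core time and that work divides perfectly across cores, neither of which is confirmed here, so treat the per-core figure as a best case rather than a prediction.

```python
def trinity_rule_of_thumb(million_reads, cores=1,
                          hours_per_million=(0.5, 1.0),
                          gb_ram_per_million=1.0):
    """Back-of-envelope estimate from the rules of thumb quoted in this thread.

    Hypothetical helper: the rates are the thread's figures, not measurements.
    """
    low_rate, high_rate = hours_per_million
    low = low_rate * million_reads
    high = high_rate * million_reads
    return {
        # Total hours if the rule of thumb is taken at face value.
        "rule_of_thumb_hours": (low, high),
        # Best case only: assumes perfectly linear scaling across cores.
        "ideal_hours_with_cores": (low / cores, high / cores),
        # RAM does not shrink with more cores; it follows the read count.
        "ram_gb": gb_ram_per_million * million_reads,
    }


estimate = trinity_rule_of_thumb(40, cores=10)
for name, value in estimate.items():
    print(name, value)
```

For 40 million reads on 10 cores this reproduces the 20-40 hour and 2-4 hour figures from the question, plus a memory requirement of about 40 GB. Only a real test run will show how far your data and hardware are from that best case.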