Question

InterProScan taking a very long time to run

0

7 months ago

Mohamed Abderrahmane ▴ 20

I'm new to transcriptomic data analysis and am running InterProScan (version 5.60-92.0) on my transcriptome, which consists of approximately 65,000 contigs with an N50 of 1,645 and an L50 of 12,422. I'm using the IFB cluster, with 40 CPUs and 50 GB of RAM allocated to the job. The job appears to be progressing normally, but it has been running for three days now. Is this runtime typical, and if so, are there ways to speed up the computation?

slurm interproscan transcriptomics • 464 views

updated 7 months ago by dthorbur ★ 2.5k • written 7 months ago by Mohamed Abderrahmane ▴ 20

score 2 · Accepted Answer · 2024-03-28

2

7 months ago

dthorbur ★ 2.5k

Parallelize. Split your transcriptome into chunks of roughly 5,000 sequences each (or fewer if you want it to go faster), and run a job array over the chunk files so that they are all processed at the same time. The same approach helps in many other bioinformatics analyses.
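
As a rough illustration of the chunking step, here is a minimal Python sketch that splits a FASTA file into numbered files of at most 5,000 records each. It uses only the standard library. The function names (`read_fasta`, `split_fasta`), the `chunk_NNNN.fasta` naming scheme, and the default chunk size are assumptions made for this example, not part of InterProScan or of any particular pipeline.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header_line, sequence_lines) for each record in a FASTA file."""
    header, lines = None, []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, lines
                header, lines = line, []
            elif header is not None:
                lines.append(line)
    if header is not None:
        yield header, lines


def split_fasta(in_path, out_dir, chunk_size=5000, prefix="chunk"):
    """Write consecutive groups of `chunk_size` records to numbered files.

    Returns the list of chunk file paths, in order. The file naming
    (prefix_0000.fasta, prefix_0001.fasta, ...) is a hypothetical convention.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    out = None
    try:
        for i, (header, lines) in enumerate(read_fasta(in_path)):
            if i % chunk_size == 0:
                # Start a new chunk file every `chunk_size` records.
                if out is not None:
                    out.close()
                path = out_dir / f"{prefix}_{len(paths):04d}.fasta"
                paths.append(path)
                out = open(path, "w")
            out.write(header)
            out.writelines(lines)
    finally:
        if out is not None:
            out.close()
    return paths
```

With about 65,000 contigs and `chunk_size=5000`, a call such as `split_fasta("transcriptome.fasta", "chunks")` (file names hypothetical) would produce roughly 13 files, `chunk_0000.fasta` through `chunk_0012.fasta`. Those indices line up with a Slurm job array such as `--array=0-12`, where each task reads `SLURM_ARRAY_TASK_ID` to pick its own chunk and runs InterProScan on that file alone.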

1

Thank you for your reply. I have just started the run with parallelization as you suggested, and I will see how the runtime compares. Have a nice day!

7 months ago by Mohamed Abderrahmane ▴ 20