Hi everyone,
I am running an alignment to the Anna's Hummingbird reference genome using files from 10 runs that range from 7.6 to 10.2 GB in size and 2.2 to 2.9 gigabases in length.
I am using the command `bwa mem -t 24`, running on a cluster with 24 CPUs and 12 GB of memory per CPU (the max I could request for the job).
How long should I expect this to take? I've read on different forums about people generating a SAM file every few minutes to every few hours, but I haven't generated a single SAM file and it has been nearly 24 hours. It seems to be stuck on a specific step, "M::process read 16790 sequences (240019667 bp)...", and I'm not at all sure why.
Thanks for your input!
Are you piping to samtools, or outputting directly to a SAM file? What is your exact command? Are the read files located on an NFS share?
Side note: I would think (depending on drive speed characteristics) that going beyond 6-10 CPUs is pointless, as the limiting factor would probably be disk I/O.
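As a quick sanity check on the log line quoted in the question, here is a minimal Python sketch that works out the mean read length from the two numbers in it. The function name `mean_read_length` and the regular expression are my own assumptions for illustration, not part of bwa, and the log text is copied as posted.

```python
import re

# Log line exactly as quoted in the question; any surrounding brackets
# may have been lost when the post was formatted.
LOG_LINE = "M::process read 16790 sequences (240019667 bp)..."


def mean_read_length(line: str) -> float:
    """Return mean bases per read from a 'read N sequences (M bp)' log line.

    Hypothetical helper for illustration; it only parses the text shown above.
    """
    match = re.search(r"read (\d+) sequences \((\d+) bp\)", line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    n_reads, n_bases = (int(group) for group in match.groups())
    return n_bases / n_reads


if __name__ == "__main__":
    print(f"{mean_read_length(LOG_LINE):.0f} bp per read on average")
```

A mean in the thousands of bases would point to long reads rather than short Illumina-style reads, so it may be worth stating the sequencing platform when asking about expected runtime.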
The read files are on the cluster server; as far as I know, I am not using an NFS share. I'm running the following (so I don't think I'm piping to samtools; I am outputting a SAM file directly):
Then I plan to run the following:
Please use the formatting bar (especially the `code` option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.