Question

SRMA (Short Read Micro-Aligner) Optimising For Speed

2


13.8 years ago

Travis ★ 2.9k

Hi all,

Does anyone have experience using this package for local realignment of reads? I would like to know how best to tune it for speed. Can the standard parameters be adjusted to get good gains in run speed, or is parallelization the best route?

Thanks in advance.

short alignment snp next-gen sequencing • 3.7k views

updated 10.8 years ago by Biostar 20 • written 13.8 years ago by Travis ★ 2.9k

score 1 · Answer 1 · 2011-06-28

1


13.8 years ago

Drio ▴ 920

Local realignment is computationally intensive, so follow the parallel route. I'd suggest you split the alignments by region (1 Mb regions were used in the tool's paper) and then compute the local realignments with SRMA using one core per split. With that method the author was able to compute the realignments for all of his data on the U87MG genome in about 87 hours.
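As a rough illustration of the splitting step, here is a minimal Python sketch that cuts a chromosome into fixed 1 Mb windows and formats each one as a `chr:start-end` string. The function name, the chromosome name, and the length are hypothetical, and the sketch assumes 1-based inclusive coordinates; it is not taken from the SRMA paper.

```python
def split_into_windows(chrom, length, window=1_000_000):
    """Return 'chrom:start-end' strings covering positions 1..length.

    Coordinates are assumed to be 1-based and inclusive.
    """
    regions = []
    for start in range(1, length + 1, window):
        # The last window is clipped to the end of the chromosome.
        end = min(start + window - 1, length)
        regions.append(f"{chrom}:{start}-{end}")
    return regions


# Hypothetical chromosome name and length, used only for illustration.
regions = split_into_windows("chrDemo", 2_500_000)
for region in regions:
    print(region)
```

Each string can then be handed to a separate realignment process, one per core.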

13.8 years ago by Drio ▴ 920

0


So a best case scenario is approx 87 hours for a human genome? Is there a faster way of doing this when many samples are being used? For example, only considering the areas where variants are called rather than analysing the entire files?

13.8 years ago by Travis ★ 2.9k

0


Absolutely, SRMA allows you to restrict the realignment to specific regions of the genome. That will dramatically reduce the running time.
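One way to derive such regions from a list of called variants is to pad each position with a flank and merge windows that overlap. The Python sketch below assumes variants are available as `(chromosome, 1-based position)` pairs; the function name and the flank size are hypothetical choices, not SRMA settings.

```python
def variant_windows(variants, flank=100):
    """Turn (chrom, pos) pairs into merged 'chrom:start-end' strings.

    Each variant is padded by `flank` bases on both sides, and windows that
    overlap or touch are merged. The end coordinate is not clipped to the
    chromosome length, which the caller would need to handle.
    """
    by_chrom = {}
    for chrom, pos in variants:
        by_chrom.setdefault(chrom, []).append(pos)

    ranges = []
    for chrom, positions in by_chrom.items():
        merged = []
        for pos in sorted(positions):
            start, end = max(1, pos - flank), pos + flank
            if merged and start <= merged[-1][1] + 1:
                # Overlaps or abuts the previous window: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        ranges.extend(f"{chrom}:{s}-{e}" for s, e in merged)
    return ranges


# Illustrative positions only.
calls = [("chr10", 9271224), ("chr10", 9271300), ("chr2", 50)]
print(variant_windows(calls, flank=50))
```

Merging keeps the number of separate runs down when variants cluster together.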

13.8 years ago by Drio ▴ 920

0


What format are ranges specified in? I can't seem to figure it out.

13.8 years ago by Travis ★ 2.9k

0


Actually, I can run a single range using, for example, RANGE=chr10:9271174-9271274. But what if I just want to run an entire chromosome? Can that be specified easily? Also, is it acceptable to join multiple commands/ranges on the command line using '&' so that each range uses a single thread?
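The thread does not say whether SRMA accepts a bare chromosome name. One possible workaround, assuming RANGE takes 1-based inclusive coordinates, is to span each chromosome from 1 to its length as listed in a samtools `.fai` index (tab-separated, with the sequence name in column 1 and its length in column 2). The helper name and the lengths below are hypothetical.

```python
def whole_chromosome_ranges(fai_text):
    """Map each sequence name in a .fai index to a 'name:1-length' string."""
    ranges = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        name, length = fields[0], int(fields[1])
        ranges[name] = f"{name}:1-{length}"
    return ranges


# Made-up index content for illustration; real lengths come from the
# .fai file that sits next to your reference FASTA.
example_fai = "chrA\t5000\t6\t60\t61\nchrB\t1200\t5100\t60\t61\n"
full = whole_chromosome_ranges(example_fai)
print("RANGE=" + full["chrA"])
```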

13.8 years ago by Travis ★ 2.9k

0


It is acceptable, assuming you have multiple cores on that machine. Storage is also important: if possible, run your processes against local disk instead of network storage.
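Chaining many commands with '&' starts them all at once, which can oversubscribe the machine when there are more ranges than cores. A small worker pool keeps the number of concurrent processes bounded. The Python sketch below is one way to do that; only `RANGE=` comes from this thread, while the `I=`, `O=`, and `R=` arguments, the jar name, and the file names are assumptions that should be checked against SRMA's own usage message.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(region, jar="srma.jar", bam="input.bam", ref="reference.fa"):
    """Build one command line per region.

    Assumption: I=, O= and R= follow the same KEY=value style as RANGE=;
    the jar and file names are placeholders.
    """
    out = "realigned." + region.replace(":", "_") + ".bam"
    return ["java", "-jar", jar, f"I={bam}", f"O={out}", f"R={ref}", f"RANGE={region}"]


def run_all(regions, workers=None, runner=subprocess.call):
    """Run one process per region, at most `workers` at a time.

    Threads are enough here because each one only waits on an external
    process. Returns the runner's result (exit code) for each region.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: runner(build_command(r)), regions))


def dry_run(cmd):
    """Print the command instead of executing it, and report success."""
    print(" ".join(cmd))
    return 0


# Dry run: shows what would be launched without needing Java or SRMA.
exit_codes = run_all(["chr10:9271174-9271274", "chr10:9271275-9271375"],
                     workers=2, runner=dry_run)
```

Dropping the `runner=dry_run` argument would launch the real processes. Writing each output to a path on local disk follows the storage advice above.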

13.7 years ago by Drio ▴ 920