bwa index not generating, taking long time
4.3 years ago
ccha97 ▴ 60

Hi there, I'm using a program called Taiji, which uses BWA. It is currently generating a BWA index for my ATAC-seq data, and I am wondering how long this is supposed to take. It has been on this step for more than 12 hours, and I only have 6 narrowPeak files (mouse samples) as input. At first I thought it was a memory issue, so I reran the entire thing on a new disk (which still has 142 GB available), but I still get the same issue. Any suggestions?

[bwa_index] Pack forward-only FASTA... 18.27 sec


alignment bwa assembly index • 4.0k views

Does the pipeline allow you to provide external indices? If so, build them externally first. It is unfortunate that a wrapper tries to do all of these things in a single run: if something fails, it has to start over from scratch.
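
A minimal sketch of building the index outside the pipeline, in case that helps. The path `mm10.fa` and the `REF` variable are placeholders (assumptions, not something from this thread), and whether Taiji can be pointed at a prebuilt index is not confirmed here. The `-a bwtsw` option is the one shown in the bwa log further down. The helper `index_complete` is a made-up name; it only checks that the five files bwa index writes next to the FASTA exist and are non-empty.

```shell
#!/bin/sh
# Hypothetical path: point REF at the reference FASTA you gave to the pipeline.
ref="${REF:-mm10.fa}"

# bwa index writes five files next to the FASTA: .amb .ann .bwt .pac .sa
# Return 0 only if all of them exist and are non-empty.
index_complete() {
  for ext in amb ann bwt pac sa; do
    [ -s "$1.$ext" ] || return 1
  done
  return 0
}

if index_complete "$ref"; then
  echo "index for $ref already present"
elif command -v bwa >/dev/null 2>&1 && [ -f "$ref" ]; then
  # Keep the progress messages so you can see which step is slow.
  bwa index -a bwtsw "$ref" > "$ref.index.log" 2>&1
else
  echo "bwa or $ref not found; nothing done"
fi
```

Running it a second time after a successful build should just report that the index is already present, so a failed pipeline run does not have to repeat the indexing.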

4.3 years ago

The index should be built on the reference genome, not on your data.


Thank you for your answer. I didn't know that, but the program (Taiji) runs it automatically, so I assume the index is being built on the mm10 reference genome (?), as the reference genome was one of the inputs. Is there any other possible explanation as to why it's stuck on this specific line?

4.3 years ago
Mensur Dlakic ★ 28k

"Is there any other possible explanation as to why it's stuck on this specific line?"

It has completed that line, so it is stuck on what comes next. Judging by the output below, that is the "Construct SA from BWT and Occ" step. Here are all the bwa index lines when indexing a small file:

[bwa_index] Pack FASTA... 0.01 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=3718406, availableWord=4724070
[bwt_gen] Finished constructing BWT in 5 iterations.
[bwa_index] 1.11 seconds elapse.
[bwa_index] Update BWT... 0.01 sec
[bwa_index] Pack forward-only FASTA... 0.01 sec
[bwa_index] Construct SA from BWT and Occ... 0.20 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index -a bwtsw group_00.fa
[main] Real time: 1.638 sec; CPU: 1.342 sec
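
If you want to check this from a saved log rather than by eye, here is a rough sketch. It assumes the stage order is the one in the output above (bwa 0.7.17) and that a stage is finished once its line ends in "sec" or a "seconds elapse" line has followed it. The function name `next_bwa_stage` is made up for this example.

```shell
#!/bin/sh
# next_bwa_stage [LOGFILE]: print the bwa index stage that is presumably
# still running, based on the stage order in the log above.
# Reads standard input when no file is given.
next_bwa_stage() {
  awk '
    BEGIN {
      n = split("Pack FASTA|Construct BWT for the packed sequence|Update BWT|Pack forward-only FASTA|Construct SA from BWT and Occ", stage, "|")
      last = 0; fin = 0
    }
    {
      for (i = 1; i <= n; i++) {
        # A stage line starts with "[bwa_index] <stage name>..."
        if (index($0, "[bwa_index] " stage[i] "...") == 1) {
          last = i
          # A trailing "sec" means the timing was printed, i.e. the stage ended.
          fin = ($0 ~ /sec[ \t\r]*$/)
        }
      }
      # BWT construction reports completion on a separate line.
      if (index($0, "seconds elapse") > 0) fin = 1
    }
    END {
      if (last == 0)     print "not started"
      else if (!fin)     print stage[last]
      else if (last < n) print stage[last + 1]
      else               print "finished"
    }
  ' "$@"
}

# The only line quoted in the question:
printf '%s\n' '[bwa_index] Pack forward-only FASTA... 18.27 sec' | next_bwa_stage
# prints: Construct SA from BWT and Occ
```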

Since BWT construction for my file took about 1 s (total time was 1.6 s) and yours took >3500 s, by extrapolation the whole process in your case should take about 1.5 hours. This assumes that indexing time is linear (I don't know if that's true) and that you have enough RAM to index this file in memory (I don't know if that's true either). If your computer is short on memory, it may be swapping, which can take a very long time.
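
For what it's worth, the arithmetic behind that estimate can be written out as a small sketch. It uses the rounded numbers from this answer (about 1 s for the step, 1.6 s in total, 3500 s for the same step on the large genome) and the same linear-scaling assumption, which may well not hold. The function name `extrapolate_hours` is made up for this example.

```shell
#!/bin/sh
# extrapolate_hours STEP_SMALL TOTAL_SMALL STEP_LARGE
# Scale the large run's step time by the small run's total/step ratio
# and convert seconds to hours. Assumes indexing time scales linearly.
extrapolate_hours() {
  awk -v s="$1" -v t="$2" -v big="$3" \
    'BEGIN { printf "%.2f\n", big * (t / s) / 3600 }'
}

extrapolate_hours 1 1.6 3500
# prints: 1.56
```

So a run that is still going after 12 hours is far outside this rough estimate, which is why swapping looks like a likely suspect.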

I suggest you find out how much your computer is swapping during this indexing operation:

swapon -s

This is what my computer shows at the moment:

Filename                                Type            Size    Used    Priority
/dev/sda1                               partition       97654780        280576  -2

As you can see, swap utilization here is less than 1%, and I suspect yours will be much higher. Alternatively, try free -m, which shows both RAM and swap usage:

              total        used        free      shared  buff/cache   available
Mem:         257873      139270       54222           8       64380      116872
Swap:         95365         274       95091
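
If you would rather have a single number, here is a small sketch that turns the Swap line of `free -m` into a percentage. It assumes the column layout shown above (total in the second column, used in the third) and English labels, hence `LC_ALL=C`. The function name `swap_used_pct` is made up for this example.

```shell
#!/bin/sh
# swap_used_pct: read `free -m` output on standard input and print the
# percentage of swap in use (used / total from the "Swap:" line).
swap_used_pct() {
  awk '$1 == "Swap:" {
    if ($2 > 0) printf "%.2f\n", 100 * $3 / $2
    else print "no swap configured"
  }'
}

# Live reading, only if `free` is available on this machine.
if command -v free >/dev/null 2>&1; then
  LC_ALL=C free -m | swap_used_pct
fi
```

With the numbers from my machine above (274 used out of 95365) this gives 0.29. Taking the reading a few minutes apart while the index is being built shows whether the figure is growing.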