Question

Issue indexing a genome with bowtie2

3.9 years ago

Rox ★ 1.4k

Hello everyone,

I have been stuck for quite some time on a side task in a Hi-C analysis that requires first indexing the genome with bowtie2. I have used it on other reference genomes with no issues, but I just can't make it work for this one particular genome.

I am using bowtie2-2.3.5.1, and my command is: bowtie2-build mother_raw_wtdbg2_58x_polished.fa bowtie2_index/mother_raw_wtdbg2_58x_polished.

This is the output I am getting:

Settings:
  Output files: "bowtie2_index/mother_raw_wtdbg2_58x_polished.*.bt2"
  Line rate: 6 (line is 64 bytes)
  Lines per side: 1 (side is 64 bytes)
  Offset rate: 4 (one in 16)
  FTable chars: 10
  Strings: unpacked
  Max bucket size: default
  Max bucket size, sqrt multiplier: default
  Max bucket size, len divisor: 4
  Difference-cover sample period: 1024
  Endianness: little
  Actual local endianness: little
  Sanity checking: disabled
  Assertions: disabled
  Random seed: 0
  Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
  mother_raw_wtdbg2_58x_polished.fa
Building a SMALL index
Reading reference sizes
  Time reading reference sizes: 00:00:33
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
  Time to join reference sequences: 00:00:25
bmax according to bmaxDivN setting: 660338289
Using parameters --bmax 495253717 --dcv 1024
  Doing ahead-of-time memory usage test
  Passed!  Constructing with these parameters: --bmax 495253717 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
  Building sPrime
  Building sPrimeOrder
  V-Sorting samples
  V-Sorting samples time: 00:01:27
  Allocating rank array
  Ranking v-sort output
  Ranking v-sort output time: 00:00:18
  Invoking Larsson-Sadakane on ranks
  Invoking Larsson-Sadakane on ranks time: 00:00:44
  Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
  (Using difference cover)
  Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
  Splitting and merging time: 00:00:00
Avg bucket size: 2.64135e+09 (target: 495253716)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
  No samples; assembling all-inclusive block
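
As a rough reading of the numbers in this log (a sketch, not a diagnosis), the "len divisor: 4" setting together with the reported bmax suggests a joined reference length of roughly 4 × 660338289 bases, which is the same order as the reported average bucket size. The short Python snippet below redoes that arithmetic. The variable names are mine, and the interpretation that the first bmax value is the joined length divided by the divisor is an assumption based only on the wording of the log lines.

```python
# Values copied from the bowtie2-build log above.
bmax_div_n = 4                  # "Max bucket size, len divisor: 4"
bmax_from_div = 660338289       # "bmax according to bmaxDivN setting"
bmax_used = 495253717           # "Using parameters --bmax 495253717"
avg_bucket = 2.64135e9          # "Avg bucket size: 2.64135e+09"
target_bucket = 495253716       # "(target: 495253716)"

# Assumption: bmax_from_div is the joined reference length divided by the
# divisor, so multiplying back gives the length to within the divisor.
approx_len = bmax_from_div * bmax_div_n

# How far the reported average bucket exceeds its target size.
ratio = avg_bucket / target_bucket

print(f"approximate joined length: {approx_len:,} bases")
print(f"avg bucket / target: {ratio:.2f}x")
print(f"bmax used / bmax from divisor: {bmax_used / bmax_from_div:.2f}")
```

If that reading is right, the average bucket is about as large as the whole joined reference and several times the target, which would fit the final "Getting block 1 of 1" line.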

And this is the directory where it writes its output:

sbsuser@node125: /work/sbsuser/test/roxane/bowtie2 $ll
total 630M
-rw-r--r-- 1 sbsuser GET-PLAGE  72K Dec 10 10:48 bovin_genome_index.1.bt2
-rw-r--r-- 1 sbsuser GET-PLAGE    0 Dec 10 10:48 bovin_genome_index.2.bt2
-rw-r--r-- 1 sbsuser GET-PLAGE  43K Dec 10 10:48 bovin_genome_index.3.bt2
-rw-r--r-- 1 sbsuser GET-PLAGE 630M Dec 10 10:48 bovin_genome_index.4.bt2
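
The listing shows only four files, one of them empty. A finished small bowtie2 index normally consists of six files (`.1.bt2` to `.4.bt2` plus `.rev.1.bt2` and `.rev.2.bt2`), so a quick way to tell whether a build completed is to look for missing or zero-byte files. Below is a minimal Python sketch of such a check; `check_index` is a hypothetical helper name, not part of bowtie2, and the prefix in the usage part is simply the one from the listing above.

```python
from pathlib import Path

# File suffixes that bowtie2-build writes for a small index.
SMALL_INDEX_SUFFIXES = (
    ".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2",
)

def check_index(prefix):
    """Return (missing, empty) lists of index file names for this prefix."""
    missing, empty = [], []
    for suffix in SMALL_INDEX_SUFFIXES:
        path = Path(str(prefix) + suffix)
        if not path.exists():
            missing.append(path.name)
        elif path.stat().st_size == 0:
            empty.append(path.name)
    return missing, empty

if __name__ == "__main__":
    # Prefix taken from the listing above; adjust to your own index prefix.
    missing, empty = check_index("bovin_genome_index")
    print("missing:", missing)
    print("empty:", empty)
```

This only reports whether the expected files exist and are non-empty; it does not validate their contents.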

So it feels like it starts doing something, then for some reason considers the input "empty" and stops indexing.

I have absolutely no idea what I am doing wrong, and I have been pulling my hair out over this for too long... Can someone please help me by pointing out the dumb mistake I am probably making?

Have a nice day,

Roxane

software error bowtie2 • 1.5k views

I don't think this is an issue on your side. There are a couple of GitHub issues about this kind of error, e.g. https://github.com/BenLangmead/bowtie2/issues/194. I would probably add a comment there and see what the developers have to say.

3.9 years ago by ATpoint 85k