7.0 years ago
rightmirem
I was running hisat2-build against a very large, multi-sample VCF file. After almost 3 days of running, it finally segfaulted...
hisat2-build --large-index -p40 --snp /(snip)/SIDXXX-A.merged.snp --haplotype /(snip)/SIDXXX-A.merged.haplotype /(snip)/GRCh38_87.fa /(snip)/SIDXXX-A.merged
Is it possible the indexes are still reliable (i.e., did it segfault AFTER the indexes were complete? I can't tell from the output.)
Is it possible to fix this?
I notice that it seems like HiSAT2 may not really be supported anymore by the developer...
OUTPUT
Settings:
Output files: "/(snip))/SIDXXX-A.merged.*.ht2l"
Line rate: 8 (line is 256 bytes)
Lines per side: 1 (side is 256 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Local sequence length: 57344
Local sequence overlap between two consecutive indexes: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
/(snip)/GRCh38_87.fa
Reading reference sizes
Time reading reference sizes: 00:00:45
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:20
Time to read SNPs and splice sites: 00:01:34
FINISHED RECURSIVE SORTS: 645
BUILD TABLE: 660
BUILD INDEX: 33
COUNTED NEW NODES: 20
COUNTED TEMP NODES: 0
RESIZED NODES: 76
RESIZED NODES: 0
MADE NEW NODES: 41
MERGEUPDATERANK: 123
TOTAL TIME: 1060
Generation 5 (3411609896 -> 3267718306 nodes, 2799078020 ranks)
ALLOCATE FROM_TABLE: 0
COUNT NUMBER IN EACH BIN: 5
FINISHED FIRST ROUND: 10
1 3267718306
0 0
0 0
0 0
(...snip...)
FINISHED RECURSIVE SORTS: 936
BUILD TABLE: 955
BUILD INDEX: 31
COUNTED NEW NODES: 5
COUNTED TEMP NODES: 0
Ran out of memory; automatically trying more memory-economical parameters.
Generation 0 (3021344366 -> 3021344366 nodes, 0 ranks)
COUNTED NEW NODES: 9
COUNTED TEMP NODES: 0
RESIZED NODES: 68
RESIZED NODES: 0
MADE NEW NODES: 11
Generation 1 (3035849135 -> 3035849135 nodes, 0 ranks)
COUNTED NEW NODES: 9
COUNTED TEMP NODES: 0
RESIZED NODES: 67
RESIZED NODES: 0
MADE NEW NODES: 10
Generation 2 (3064868525 -> 3064868525 nodes, 0 ranks)
COUNTED NEW NODES: 10
COUNTED TEMP NODES: 0
RESIZED NODES: 66
RESIZED NODES: 0
MADE NEW NODES: 17
Generation 3 (3122962997 -> 3122962997 nodes, 0 ranks)
BUILT FROM_INDEX: 27
COUNTED NEW NODES: 4
COUNTED TEMP NODES: 0
RESIZED NODES: 0
RESIZED NODES: 0
MADE NEW NODES: 1
RESIZE NODES: 5
COUNT NUMBER IN EACH BIN: 1
/var/spool/slurmd/job138212/slurm_script: line 9: 41849 Segmentation fault hisat2-build --large-index -p40 --snp /(snip)/SIDXXX-A.merged.snp --haplotype /(snip)/SIDXXX-A.merged.haplotype /(snip)/GRCh38_87/GRCh38_87.fa /(snip)/SIDXXX-A.merged
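A rough first check for the "were the indexes complete?" question is to see which index files were actually written. The sketch below assumes that a finished --large-index build leaves eight files named PREFIX.1.ht2l through PREFIX.8.ht2l (matching the "*.ht2l" output pattern in the log above); check_ht2l_index is a hypothetical helper name, not part of HISAT2.

```shell
#!/usr/bin/env bash
# Sketch: report which expected large-index files are missing or empty.
# Assumption: a finished --large-index build leaves PREFIX.1.ht2l .. PREFIX.8.ht2l.
# check_ht2l_index is a hypothetical helper, not a HISAT2 command.
check_ht2l_index() {
    local prefix="$1"
    local bad=0
    local i f
    for i in 1 2 3 4 5 6 7 8; do
        f="${prefix}.${i}.ht2l"
        # -s is true only if the file exists and its size is greater than zero
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            bad=$((bad + 1))
        fi
    done
    echo "$bad of 8 expected files missing or empty"
    # Return success only if all eight files are present and non-empty
    [ "$bad" -eq 0 ]
}
```

Call it with the same output prefix that was passed to hisat2-build. A clean result only shows that the files exist and are non-empty; it does not prove their contents are complete. Since the log above stops in the middle of a "Generation" step, missing or truncated files would not be surprising.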
@OP I believe STAR is less memory-hungry than HISAT. Could you try with STAR?