2.9 years ago
Peter
Hello
I used the following commands:
hisat2_extract_splice_sites.py Homo_sapiens.GRCh38.80.gtf > splice_sites.txt
hisat2_extract_exons.py Homo_sapiens.GRCh38.80.gtf > exons.txt
hisat2-build referenceData/fasta/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa \
--ss referenceData/hisat2_index/splice_sites.txt \
--exon referenceData/hisat2_index/exons.txt \
referenceData/hisat2_index/GRCh38.hisat2
Settings:
Output files: "referenceData/hisat2_index/GRCh38.hisat2.*.ht2"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Local sequence length: 57344
Local sequence overlap between two consecutive indexes: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
referenceData/fasta/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa
Reading reference sizes
Time reading reference sizes: 00:00:22
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:16
Time to read SNPs and splice sites: 00:00:02
It has been running for over an hour. So far these output files have been created in my directory:
GRCh38.hisat2.0.rf (27 GB)
GRCh38.hisat2.1.ht2 (8.2 KB)
GRCh38.hisat2.2.ht2 (0 bytes)
GRCh38.hisat2.3.ht2 (11.3 KB)
GRCh38.hisat2.4.ht2 (736 MB)
GRCh38.hisat2.7.ht2 (13.1 MB)
GRCh38.hisat2.8.ht2 (2.6 MB)
It hasn't given any errors yet, but I'm worried. This is my first time analyzing RNA-seq data. Does anyone know what's going on?
Thanks!
As long as the program is still running and producing output, there is nothing to worry about. Be patient and wait.
One hour is not long for a full human index with SNPs on a single core. It will take some time. Get a coffee and wait.
I agree that 1 hour is not that long for hisat2-build on a human genome. Since you are building it with --ss and --exon, it will need about 200 GB of RAM according to the manual. If you need a prebuilt index, the hisat2 website (http://daehwankimlab.github.io/hisat2/download/) has a few ready for download.
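Before starting a long build, it can help to compare the machine's RAM with the roughly 200 GB figure mentioned above. The snippet below is a minimal sketch for Linux only, since it reads /proc/meminfo; the helper names `mem_total_gb` and `has_enough_mem` are made up for illustration and are not part of hisat2.

```shell
#!/usr/bin/env bash
# Sketch: check total RAM before running hisat2-build with --ss/--exon.
# Assumes Linux (/proc/meminfo). Helper names are hypothetical.

# Print total physical memory in whole GiB (rounded down).
# An alternative meminfo path can be passed as $1 (useful for testing).
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed (exit status 0) if total memory is at least $1 GiB.
has_enough_mem() {
    local required_gb=$1
    local meminfo=${2:-/proc/meminfo}
    [ "$(mem_total_gb "$meminfo")" -ge "$required_gb" ]
}

if has_enough_mem 200; then
    echo "Enough RAM for an index built with --ss/--exon"
else
    echo "Less than 200 GB RAM: consider a plain index or a prebuilt one"
fi
```

Total RAM is only a rough guide: on a shared machine or a cluster, the memory actually available to the job may be lower than what MemTotal reports.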
Thanks
Worked well! I took out --ss and --exon.
My output was 8 .ht2 files of different sizes.
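For anyone taking the same route, here is a rough sketch of what the plain build looks like, plus one way to still use the extracted splice sites at alignment time through hisat2's --known-splicesite-infile option. The functions only print the commands so they can be reviewed before running. The function names, the thread count, and the FASTQ/SAM file names are placeholders I made up, not something from this thread.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints commands for review instead of executing them.
# Thread count and read/output file names below are hypothetical.

# Print a hisat2-build command for a plain index (no --ss / --exon).
plain_index_cmd() {
    local fasta=$1 prefix=$2 threads=${3:-1}
    printf 'hisat2-build -p %s %s %s\n' "$threads" "$fasta" "$prefix"
}

# Print a paired-end hisat2 command that supplies known splice sites
# at alignment time rather than baking them into the index.
align_cmd() {
    local prefix=$1 splice_sites=$2 r1=$3 r2=$4 out=$5
    printf 'hisat2 -x %s --known-splicesite-infile %s -1 %s -2 %s -S %s\n' \
        "$prefix" "$splice_sites" "$r1" "$r2" "$out"
}

plain_index_cmd \
    referenceData/fasta/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa \
    referenceData/hisat2_index/GRCh38.hisat2 8

align_cmd \
    referenceData/hisat2_index/GRCh38.hisat2 \
    referenceData/hisat2_index/splice_sites.txt \
    sample_R1.fastq.gz sample_R2.fastq.gz sample.sam
```

Once the printed commands look right, they can be pasted into the shell or piped to bash.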
In my experience, you have a problem if any of the .ht2 index files is still empty once the build has finished. This seems to crop up whenever the index build step didn't have enough memory, and you won't be able to align reads using that index until it's corrected.
For reference, I've only been able to make it through building a hisat2 index with 200 GB of RAM, which lines up with the recommendation in the manual.
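A quick way to spot this after a build finishes is to list any zero-byte index files. This is a small sketch; `empty_index_files` is a name I made up, and the `.ht2*` pattern is there on the assumption that large indexes use the `.ht2l` extension.

```shell
#!/usr/bin/env bash
# Sketch: list zero-byte hisat2 index files for a given index prefix.
# The function name is hypothetical; run it only after the build has ended,
# because files can legitimately be empty while hisat2-build is still running.

empty_index_files() {
    local prefix=$1
    local f
    for f in "$prefix".*.ht2*; do
        # -e skips the literal pattern when nothing matches;
        # ! -s is true when the file has size zero.
        if [ -e "$f" ] && [ ! -s "$f" ]; then
            echo "$f"
        fi
    done
}

# Example with the prefix used in this thread; prints nothing if all is well.
empty_index_files referenceData/hisat2_index/GRCh38.hisat2
```

If this prints any file names, the index is incomplete and should be rebuilt with more memory or replaced with a prebuilt one.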